Ribosome recycling, diffusion, and mRNA loop formation in translational regulation 
ABSTRACT We explore and quantify the physical and biochemical mechanisms that may be relevant
in the regulation of translation. After elongation and detachment from the 3' termination site
of mRNA, parts of the ribosome machinery can diffuse back to the initiation site, especially if it is 
held nearby, enhancing overall translation rates. The elongation steps of the mRNA-bound ribo- 
somes are modeled using exact and asymptotic results of the totally asymmetric exclusion process 
(TASEP) [Derrida & Evans 1997]. Since the ribosome injection rates of the TASEP depend on the 
local concentrations at the initiation site, a source of ribosomes emanating from the termination 
end can feed back to the initiation site, leading to a self-consistent set of equations for the steady- 
state ribosome throughput. Additional mRNA binding factors can also promote loop formation, 
or cyclization, bringing the initiation and termination sites into close proximity. The probability 
distribution of the distance between the initiation and termination sites is described using simple 
noninteracting polymer models. We find that the initiation, or initial ribosome adsorption binding 
required for maximal throughput can vary dramatically depending on certain values of the bulk ri- 
bosome concentration and diffusion constant. If cooperative interactions among the loop-promoting 
proteins and the initiation/termination sites are considered, the throughput can be further regulated 
in a nonmonotonic manner. Potential experiments to test the hypothesized physical mechanisms 
are discussed. 
INTRODUCTION

The rate of protein production needs to be constantly 
regulated for all life processes. Genetic expression, pro- 
tein production, and post-translational modification, as 
well as transport and activation, are all processes that 
can regulate the amount of active protein/enzymes in a 
cell. Although much recent research has focused on the 
biochemical steps regulating the switching of genes and 
rates of transcription, translational control mechanisms, 
post-translational processing, and macromolecular trans- 
port are also important. For example, during embryo- 
genesis, nuclear material is highly condensed, transcrip- 
tional regulation is inactive, and translational control is 
important [Browder 1991, Wickens et al. 1996]. In other 
instances, transcriptional regulation is accompanied by 
long lag times, particularly with long genes. Transla- 
tional regulation is also the only means by which RNA 
viruses express themselves. 

Protein production, as with other cellular processes, 
requires the assembly of numerous specific enzymes and 
cofactors for initiation. This assembly occurs in free so- 
lution and on the 5' initiation site of mRNA. Translation 
involves unidirectional motion of the ribosome complex 
along the mRNA strand as amino-acid-carrying tRNA 
successively transfer amino acids to the growing polypep- 
tide chain. Images of mRNA caught in the act of transla- 
tion often show numerous ribosome complexes attached 
to the single-stranded nucleotide (Fig. 1A). The multiple
occupancy is presumably a consequence of very active
translation, when many copies of protein are desired. 

Under certain conditions, the local concentration of
tRNA, ribosomes, initiation factors, etc., will control
protein production. One possible physical feedback 
mechanism underlying all the other biochemical regu- 
lation processes utilizes local concentration variations 
of the components of translation machinery. More- 
over, there is ample biochemical evidence that the 5' 
and 3' ends of eukaryotic mRNA interact with each 
other, aided by proteins that bind to the poly(A) tail 
and/or regions near the initiation site [Sachs 1990], 
particularly if the 5' initiation terminus is capped. 
The presence of both a poly(A) tail and a 5' cap
has been found to synergistically enhance translation
rates in a number of eukaryotic systems [Gallie 1991].
Numerous proteins that initiate translation, such as 
eukaryotic initiation factor eIF4, have been identified 
to bind to the cap and initiate ribosomal bind- 
ing [Mathews et al. 1996, Munroe & Jacobson 1990, 
Preiss & Hentze 1999, Sachs 2000]. A different set of 
proteins, poly(A) binding proteins (PAB) such as Pab1p,
are found to bind to the poly(A) tail. The proteins
on the 5' cap and the poly(A) tail are also known to
form a complex (cap-eIF4E-eIF4G-Pab1p-poly(A) tail)
which can increase translation rates [Jackson 1996, 
Munroe & Jacobson 1990, Sachs 1997, Sachs 2000]. In 
vitro solutions of capped, poly(A)-tailed mRNA, tRNA, 
and ribosomes fail to display synergy [Gallie 1991],
indicating that additional factors are required for coop- 
erative interactions between the cap and the poly(A) tail. 
However, in vitro systems that include caps, poly(A) 
tails, eIFs, and PABs reveal circularized mRNA structures
in electron micrograph (EM) (Fig. 1A) and atomic
force microscopy (AFM) (Fig. 1B) images. In this way,
it is thought that various components of the translation 
machinery can be recycled after termination without 
completely reentering the enzyme pool in the cytoplasm.




FIG. 1: (A) An electron micrograph of polysomes on mRNA. 
(B) An AFM micrograph of circularization of mRNA mediated
by loop forming proteins. From Wells et al. (1998).
These images are of double stranded RNA of approximate
length 2-4 × the dsRNA persistence length. Single stranded
end segments with loop binding factors comprise the ends. 

Even in uncapped mRNA, there is evidence that 
certain sequences in the terminal 3' untranslated re- 
gion (UTR) can enhance translation to levels compara- 
ble to those seen in capped mRNAs [Wang et al. 1997, 
Jackson 1996]. Additionally, there are indications 
that proteins near the termination end can, upon 
contact, directly activate [Gallie 1991] or inactivate 
[Curtis et al. 1995, Dubnau & Struhl 1996] ribosome en- 
try at the 5' initiation site. Loops also appear to be a 
common motif in DNA structures [Goddard et al. 2000, 
Martin & Hagerman 1996] and appear to take part
in transcriptional regulation [Martin & Hagerman 1996,
Dunn et al. 1984, Wyman et al. 1997]. Double stranded 
DNA has a much longer persistence length than single- 
stranded nucleic acids (such as mRNA) and is much 
less likely to form loops without accompanying bind- 
ing proteins or specific sequences. Direct evidence for 
RNA "circularization" is shown in Figure 1B, which
shows loop formation of relatively short double-stranded 
mRNA in the presence of loop-binding factors at their 
ends [Hagerman 1985]. It is reasonable to expect that 
the more flexible single-stranded mRNA decorated with 
ribosomes can form similar loops. Besides the AFM-imaged
loop of double stranded RNA shown in Fig. 1B,
there is also substantial evidence, particularly in viral 
mRNAs, that base pairing between uncapped 5' regions 
and non-polyadenylated 3' regions forms closed loops of 
many kilobases [Wang et al. 1997]. This loop formation 
by direct base pairing, or "kissing," is a very plausible 
mechanism by which the 3' UTR recruits ribosomes and 
delivers them to the 5' initiation site [Guo et al. 2001]. 

In this paper, we model the proposed cyclization, 
or "circularization" [Sachs 1997] and ribosome recycling 
mechanisms. Cooperative interactions of the initiation 
and termination sites with eIFs and PAB proteins will
also be considered within a number of reasonable as- 
sumptions. Since translation employs an immense di- 
versity of mechanisms and proteins that vary greatly 
across organisms [Mathews et al. 1996], we will only de- 
velop an initial, qualitative physical picture of cytoplas- 
mic mRNA translation consistent with the ingredients 
mentioned above. Three different coupled effects are 
considered in turn: (i) a totally asymmetric exclusion 
process (TASEP) describing the unidirectional stochastic
motion of the ribosome along the mRNA, (ii) the
diffusion and adsorption/desorption kinetics from the
mRNA initiation/termination sites, and (iii) the polymer
physics associated with how the termination and
initiation sites are spatially distributed relative to each 
other. The ribosome density along the mRNA, as well as 
the time-averaged throughput of ribosomes, the ribosome 
"current," are described by solutions of the TASEP. The 
parameters in the TASEP are the internal hopping rates 
and the injection and extraction rates at the initiation 
and termination sites, respectively. Since ribosome com- 
ponents that diffuse in bulk must adsorb on the initiation 
site, the injection rate used in the TASEP will be pro- 
portional to the local concentration of the rate-limiting 
ribosome. Ribosomes that reach the termination site des- 
orb and reenter the pool of diffusing ribosomes. The 
distance between the termination end and the initiation 
site, when ribosomes are released, can thus influence the 
absorption rate and hence the overall translation rate. 
The initiation-termination end-to-end distance distribu- 
tion can be estimated with basic polymer physics. The 
end-to-end distance distribution can include effects such 
as specific binding of poly(A) associated proteins with 
the 5' cap, thereby forming a loop, bringing the initiation 
and termination sites into close proximity. Although our 
model applies only to cytoplasmic mRNA translation,
many of its components can also be adapted to treat 
mRNA adsorption on endoplasmic reticulum (ER) and 
ER-assisted translation. 

PHYSICAL MODELS 

We now consider the physical processes necessary to 
describe the above-mentioned translation processes. At 
the relevant time scales, we will see that fluctuations in 
these physical mechanisms are uncorrelated with each 
other. This allows us to consider simple steady-states 
where time or ensemble averages of the TASEP, ribo- 
some diffusion in the cytoplasm, and the mRNA chain 
conformations are uncorrelated and can be taken inde- 
pendently of each other. A simplifying schematic of the 
basic ingredients of mRNA translation is given in Fig. 2. 

The Asymmetric Exclusion Process 

The TASEP is one of a very small number of 
interacting nonequilibrium models with known ex- 
act solutions. Asymmetric exclusion models have 
been used to effectively model qualitative fea- 
tures of diverse phenomena including ion transport 
[Hahn et al. 1996, Chou 1999, Chou & Lohse 1999], 
traffic flow [Schreckenberg et al. 1995], and the ki- 
netics of biopolymerization [MacDonald et al. 1968, 
MacDonald & Gibbs 1969]. Briefly, the model consists 
of a 1D lattice of $N$ sites, each of approximately
the molecular size of a ribosome unit. Each variable
$\hat{\sigma}_i \in \{0, 1\}$ represents the ribosome occupation at site $i$
of the coding region of mRNA. Each site can be occupied
by at most one ribosome, and the mean occupation
$\sigma_i \equiv \langle \hat{\sigma}_i \rangle$ at each site satisfies $1 > \sigma_i > 0$. The probability
in time $dt$ that an individual ribosome moves forward
to the next site (toward the 3' end) is $p\,dt$, provided
the adjacent site immediately in front is unoccupied.
Backward moves are not allowed, since ribosomes are
strongly driven motors that move unidirectionally from
5' to 3'. The entrance and exit rates at the initiation
($i = 1$) and termination ($i = N$) sites are denoted $\alpha$ and
$\beta$, respectively (cf. Fig. 2C). The exact steady-state solutions
to this kinetic model, including the average density
$\sigma_i$ and the mean particle (ribosome) current, have been
found by Derrida and Evans [Derrida et al. 1993], using
a matrix product ansatz, and by Schütz and Domany
[Schutz & Domany 1993], using an iteration method.
An exact representation for the steady-state current
across an $N$-site chain is [Derrida et al. 1993]




FIG. 2: A cartoon of mRNA translation in eukaryotes. The 
intermediary proteins and cofactors are not depicted. (A) An 
mRNA chain loaded with ribosomes (green), in various stages 
of protein (black) production. Ribosomal components as well 
as other components such as tRNA exist at a uniform back- 
ground concentration. The initiation and termination sites 
are additional sinks ($i = 1$) and sources ($i = N$), respectively,
of ribosomes. (B) Binding factors (yellow and dark grey) can 
increase the probability of loop formation or "circularization," 
which brings the poly(A) tail (red) in better proximity to the 
initiation site, enhancing ribosome recycling. (C) Schematic 
of the associated TASEP with injection ($\alpha$), internal hop ($p$),
and desorption ($\beta$) rates labelled.



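In the $N \to \infty$ limit, the 1D TASEP (Eq. 1) admits
three nonequilibrium steady-state phases, representing
different regimes of the steady state current $J$:

$$
\begin{array}{lll}
\mbox{(I)} & \alpha, \beta \geq p/2: & J = p/4 \\
\mbox{(II)} & \alpha < p/2,\ \beta > \alpha: & J = \alpha(1 - \alpha/p) \\
\mbox{(III)} & \beta < p/2,\ \beta < \alpha: & J = \beta(1 - \beta/p)
\end{array}
\qquad (3)
$$
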
FIG. 3: The infinite chain ($N \to \infty$) limit nonequilibrium
phase diagram of the standard TASEP. The maximal current
(I), low density (II), and high density (III) phases and their
corresponding steady state currents are indicated. In this 
and subsequent phase diagrams, solid curves correspond to 
phase boundaries across which the slope of the steady-state 
currents (with respect to the parameters) is discontinuous. 
Across the dashed phase boundaries, the currents and their 
first derivatives are continuous. 



The phases (I), (II), and (III) defined by Eqs. 3 are
denoted as the maximal current, low density, and high
density phases, respectively, and are delineated in Fig.
3 by the dotted phase boundaries. Qualitatively, when
$\beta$ is small, and injection rates are faster than extraction
rates ($\alpha > \beta$), the rate-limiting process is the exit step at
$i = N$. Therefore, the high occupancy phase (III) has a
low current which is a function of only the slow step $\beta$. In
the opposite limit of fast desorption at $i = N$, and slow
injection at $i = 1$ (small $\alpha$), the chain is always nearly
empty, and has a small current $J$ that depends only upon
the rate-limiting step $\alpha$; this is the low density phase (II).
For large $\alpha$ and $\beta$, the system
attains maximal current $J = p/4$, where the effective rate-limiting
steps are the internal hopping rates $p$. In this phase,
the constant current $J = p/4$ is independent of further
increases in $\alpha$ or $\beta$. The ribosomal currents given by
Eqs. 3 and the associated phase diagram in Fig. 3 are
valid only in the $N \to \infty$ limit. Nonetheless, the $N = \infty$
phase diagram is qualitatively accurate for the currents
expected at large but finite $N$.
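
As a hedged numerical illustration of this last point, the sketch below compares a finite-$N$ current with the three $N \to \infty$ expressions. It assumes the standard matrix-product normalization $Z_N$ of Derrida et al. (with $Z_0 = 1$ and rates measured in units of $p$), so that $J = p\,Z_{N-1}/Z_N$, together with the usual asymptotic forms $p/4$, $\alpha(1-\alpha/p)$, and $\beta(1-\beta/p)$. The function names are hypothetical and are not taken from the paper.

```python
from math import factorial


def normalization(n_sites, alpha, beta):
    """Matrix-product normalization Z_N for hop rate p = 1 (assumed Z_0 = 1)."""
    if n_sites == 0:
        return 1.0
    total = 0.0
    for q in range(1, n_sites + 1):
        # Ballot-number coefficient q (2N-1-q)! / (N! (N-q)!), always an integer.
        coeff = q * factorial(2 * n_sites - 1 - q) // (
            factorial(n_sites) * factorial(n_sites - q))
        # (beta^-(q+1) - alpha^-(q+1)) / (beta^-1 - alpha^-1), written as a
        # finite sum so that alpha == beta needs no special treatment.
        weight = sum(beta ** (-j) * alpha ** (-(q - j)) for j in range(q + 1))
        total += coeff * weight
    return total


def exact_current(n_sites, alpha, beta, p=1.0):
    """Finite-N steady-state current J = p Z_{N-1} / Z_N (floats; N up to ~200)."""
    a, b = alpha / p, beta / p
    return p * normalization(n_sites - 1, a, b) / normalization(n_sites, a, b)


def asymptotic_current(alpha, beta, p=1.0):
    """N -> infinity current in the three phases."""
    if alpha >= p / 2 and beta >= p / 2:
        return p / 4                       # (I) maximal current
    if alpha < beta:
        return alpha * (1 - alpha / p)     # (II) low density
    return beta * (1 - beta / p)           # (III) high density


if __name__ == "__main__":
    for n in (10, 50, 100):
        print(n, exact_current(n, 0.2, 0.8), asymptotic_current(0.2, 0.8))
        print(n, exact_current(n, 1.0, 1.0), asymptotic_current(1.0, 1.0))
```

In the boundary-limited phases the finite-$N$ value approaches the asymptotic one quickly, whereas in the maximal current phase the approach is only algebraic in $N$.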

There may appear to be a microphysical inaccuracy:
The TASEP defined above corresponds to individual
movements with step length equal to the ribosome size.
However, ribosomes typically occlude ~ 10 codons, so
that it takes ~ 10 microscopic steps for the ribosome to
move the distance of its own size [Lakatos & Chou 2003,
Shaw et al. 2003]. An accurate approximation for the
throughput $J$ (Eq. 1) is to assume that each step between
two sites defined in our model consists of ~ 10
actual tRNA transfers. The effective rate $p$ is thus
the average tRNA transfer rate reduced by a factor
of ~ 10. With this consideration, the TASEP completely
determines the steady state ribosome throughput
as long as the effective rate $p$ is appropriately defined.
Therefore, we will treat the mRNA translation
problem using step sizes equal to the ribosome size, with
the understanding that for appropriately rescaled transition
rates, our results will be qualitatively correct. The
exact currents of a TASEP, where the particle diameters
are $q$ times the step size, are given in Appendix B
[MacDonald et al. 1968]. Explicit Monte-Carlo simulations
have also been performed on large particle/small
step size dynamics to confirm the accuracy of the results
[Lakatos & Chou 2003, Shaw et al. 2003].
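
A minimal sketch of such a simulation is given below, using a continuous-time (Gillespie) update. It is not the code used in the cited works. The boundary conventions for extended particles (entry when the first $q$ sites are free, exit once the right edge sits on the last site) are assumptions made here for illustration, and `simulate_tasep` is a hypothetical name. Setting `q=1` recovers the standard TASEP.

```python
import random


def simulate_tasep(n_sites, alpha, beta, p=1.0, q=1,
                   n_events=200_000, burn_in=20_000, seed=0):
    """Continuous-time TASEP with particles covering q sites.

    Assumed conventions (not from the paper): a particle is labelled by its
    left edge, enters at site 0 when sites 0..q-1 are free, hops one site at
    rate p, and exits at rate beta once its right edge is on the last site.
    Returns the measured exit current (particles per unit time).
    """
    rng = random.Random(seed)
    pos = []            # left edges; pos[0] is the particle nearest the 3' end
    t, exits = 0.0, 0
    for event in range(burn_in + n_events):
        if event == burn_in:            # discard the initial transient
            t, exits = 0.0, 0
        moves, rates = [], []
        if not pos or pos[-1] >= q:     # entry region is clear
            moves.append(("enter", -1))
            rates.append(alpha)
        for i, x in enumerate(pos):
            site_exists = x + q <= n_sites - 1
            site_free = i == 0 or pos[i - 1] - x > q
            if site_exists and site_free:
                moves.append(("hop", i))
                rates.append(p)
        if pos and pos[0] == n_sites - q:
            moves.append(("exit", 0))
            rates.append(beta)
        t += rng.expovariate(sum(rates))            # waiting time to next event
        kind, i = rng.choices(moves, weights=rates)[0]
        if kind == "enter":
            pos.append(0)
        elif kind == "hop":
            pos[i] += 1
        else:
            pos.pop(0)
            exits += 1
    return exits / t


if __name__ == "__main__":
    print("q = 1 current:", simulate_tasep(30, 1.0, 1.0, q=1))
    print("q = 3 current:", simulate_tasep(90, 1.0, 1.0, q=3))
```

For `q=1` the measured current can be checked against the exact finite-$N$ result; for `q>1` the output depends on the assumed boundary conventions and should be read only qualitatively.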

What remains is to determine the self-consistent dependence
of the model parameters, in particular $\alpha$ and
$\beta$, on the local ribosome concentration (which in turn
depends on the mean current $J$), diffusion rates, circularization,
etc. For example, the injection rate $\alpha$ at the
initiation site will be proportional to a microscopic binding
rate $k$ times the local ribosome concentration.

Steady-State Release, Diffusion, and Capture 

The complete mRNA translation machinery is extremely
complicated, since it is composed of many auxiliary
RNA and protein cofactors, as well as a collection
of active mRNA chains. Since there are many active
mRNA chains in the cytoplasm, each mRNA chain
feels the sinks (initiation sites) and sources (termination 
sites) of all the other mRNA chains. However, these 
other randomly distributed chains, each with their own 
initiation and termination sites, contribute an averaged 
background ribosome concentration. Thus, it is only the 
termination site (ribosome source) associated with the 
initiation site on the same mRNA chain that resupplies 
the initiation site in a correlated manner. We thus con- 
sider a single "isolated" mRNA chain and for the sake 
of simplicity, assume that a single component, say phosphorylated
eukaryotic initiation factor eIF4F or eIF2
[Clemens 1996, Sachs 2000], is key to a rate-limiting step.
We will generically call this component the "ribosome." 
Consider a source of newly-detached ribosomes (emanat- 
ing from the 3' termination site) at position r away from 
the 5' initiation site. The probability of finding this par- 
ticle within the volume element dr about r obeys the 
linear diffusion equation with the termination site acting 
as a source, 



$$\partial_t P(\mathbf{r},t) - D\nabla^2 P(\mathbf{r},t) = J(t)\, W_{\rm eff}(\mathbf{r},t), \qquad (4)$$



where $D$ is the bulk ribosome diffusion constant, $J(t)$ is
the instantaneous rate of ribosome release from the termination
end, and $W_{\rm eff}(\mathbf{r})\,d\mathbf{r}$ is the probability that the
termination site is within the positions $\mathbf{r}$ and $\mathbf{r} + d\mathbf{r}$ from
the initiation site. Although Eq. 4 can be solved exactly
for all times, the TASEP result (Eq. 1) is appropriate
only in the steady-state, so we must consider that limit
for all processes.

The typical mRNA passage time of a single ribosome 
is on the order of one minute. The bulk diffusion constant
of the 10-15 nm radius ($a \sim 15$ nm) ribosome unit
is $\sim 10^{-8} - 10^{-7}$ cm$^2$/s. A ribosome molecule will diffuse
the length of a 1 kb mRNA strand in ~ 0.1 s.
Therefore, with each release of a ribosome from the termination
site, the probability density appears as a pulse
which passes through the initiation site over a time scale
shorter than it takes for a ribosome to stochastically hop
a few lengths of its size along the mRNA chain. Therefore,
an upper bound on the amount of correlation between
concentration fluctuations and $\sigma_1$ can be found
by considering the equal time two-point correlation in
the maximal current phase $\langle \hat{\sigma}_1 \hat{\sigma}_N \rangle - \sigma_1 \sigma_N \sim$
[Derrida & Evans 1993]. Two-point correlations in other
current regimes are smaller, and decay exponentially with
$N$ [Essler & Rittenberg 1996]. Therefore, we can neglect
the correlation of the current $J(t)$ with the occupancy $\sigma_1$
at the initiation site. Moreover, the end-to-end distribution
$W_{\rm eff}$ arises from the statistics of the mRNA polymer
configurations and is also assumed independent of both
$J(t)$ and $\sigma_1$. The steady state ribosome distribution can
thus be found by setting $\partial_t P(\mathbf{r},t) = 0$ on the left-hand
side of Eq. 4 and taking the time, or ensemble, average
of the remaining Poisson equation to obtain



$$\langle \nabla^2 P(\mathbf{r},t) \rangle = \nabla^2 C(\mathbf{r}) = -\frac{J}{D}\, W_{\rm eff}(\mathbf{r}), \qquad (5)$$



where $J = \langle J(t) \rangle$ is the steady-state current of ribosomes
emanating from the termination end of the mRNA and reentering
the bulk ribosome pool, and $C(\mathbf{r}) = \langle P(\mathbf{r}, t) \rangle$ is
the ensemble average of $P(\mathbf{r}, t)$.
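
The separation of time scales invoked above can be checked with rough numbers. The sketch below is only an order-of-magnitude estimate: the Stokes-Einstein relation with a water-like viscosity, the contour length of 0.34 nm per base, and the range of diffusion constants tried are all assumptions made here for illustration, and a crowded cytoplasm would lower $D$.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def stokes_einstein(radius_m, temperature_k=310.0, viscosity_pa_s=1.0e-3):
    """Diffusion constant (m^2/s) of a sphere; water-like viscosity is assumed."""
    return K_B * temperature_k / (6.0 * math.pi * viscosity_pa_s * radius_m)


def diffusion_time(length_m, diffusion_m2_s):
    """Order-of-magnitude time L^2 / D needed to diffuse a distance L."""
    return length_m ** 2 / diffusion_m2_s


if __name__ == "__main__":
    a = 15e-9                     # ribosome radius quoted in the text
    contour = 1000 * 0.34e-9      # ASSUMED 0.34 nm per base for a 1 kb strand
    d_water = stokes_einstein(a)
    print(f"Stokes-Einstein D in water: {d_water * 1e4:.2e} cm^2/s")
    for d_cm2 in (1e-8, 1e-7):    # ASSUMED illustrative range of D
        t = diffusion_time(contour, d_cm2 * 1e-4)
        print(f"D = {d_cm2:.0e} cm^2/s -> t ~ {t:.3f} s (passage time ~ 60 s)")
```

Under these assumptions the diffusive transit time is a small fraction of a second, far shorter than the roughly one-minute passage time of a ribosome along the mRNA.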

The boundary condition for $C(\mathbf{r})$ at the initiation site
will depend on the occupancy of that site. When it is
empty, there is a flux due to the microscopic adsorption
step onto the first site. When $\hat{\sigma}_1 = 1$, the bulk
ribosome probability distribution will obey perfectly reflecting
boundary conditions. Since the probability at
$r = a$, $P(r = a, t)$, depends on the occupation $\hat{\sigma}_1$,
$\langle P(a,t)\hat{\sigma}_1 \rangle \neq C(a)\sigma_1$. The mean concentration at $r = a$
must be found by averaging the currents in the two states,
$\hat{\sigma}_1 = 1$ and $\hat{\sigma}_1 = 0$. When the initiation site is empty,



$$J(\hat{\sigma}_1 = 0) = J_0 = 4\pi a^2 D\, \partial_r C(r = a) = k\, C(r = a). \qquad (6)$$

Since the steady state current $J(\hat{\sigma}_1 = 1) = 0$ when the
initiation site is full, the averaged steady-state current is



$$J = (1 - \sigma_1)\, J(\hat{\sigma}_1 = 0) + \sigma_1\, J(\hat{\sigma}_1 = 1) = (1 - \sigma_1)\, J_0, \qquad (7)$$

where $(1 - \sigma_1)$ is the fraction of time that the initiation
site is unoccupied, ready to absorb a ribosome from
the bulk. This probability is not directly dependent on
the distribution $W_{\rm eff}(\mathbf{r})$, but will depend on the time-averaged
local concentration $C(\mathbf{r})$, which in turn depends
on $W_{\rm eff}$ only through the distance of the source site at
$i = N$.

The solution to Eq. 5, obeying the boundary conditions
Eq. 6 and $C(r \to \infty) = C_\infty$, is



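$$C(\mathbf{r}) = C_\infty - \frac{C_\infty}{r}\left(\frac{k a}{4\pi D a + k}\right) + \frac{J}{D}\int d\mathbf{r}'\, G(\mathbf{r} - \mathbf{r}')\, W_{\rm eff}(\mathbf{r}'), \qquad (8)$$
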

where r is distance measured from the initiation site, and 



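$$G(\mathbf{r},\mathbf{r}') = \frac{1}{4\pi|\mathbf{r} - \mathbf{r}'|} - \sum_{\ell=0}^{\infty}\sum_{m=-\ell}^{+\ell} \left[\frac{k a^{\ell} - 4\pi a^2 D\, \ell\, a^{\ell-1}}{k a^{-\ell-1} + 4\pi a^2 D (\ell+1)\, a^{-\ell-2}}\right] \frac{Y_{\ell m}(\Omega)\, Y^{*}_{\ell m}(\Omega')}{(2\ell+1)\,(r_< r_>)^{\ell+1}} \qquad (9)$$
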

is the associated Green function. In Eq. 9, $r_<$ ($r_>$) is
the smaller (larger) of $|\mathbf{r}|$, $|\mathbf{r}'|$, and $Y_{\ell m}(\Omega)$ are the spherical
harmonic functions of the solid angle $\Omega$ defined by
the vector $\mathbf{r}$ [Arfken 1985]. The first two terms in Eq.
8 arise from the uniform concentration $C_\infty$ at infinity
and the effects of a sink of radius $a$ at the initiation
site. The sink decreases the effective concentration to
a level below that of $C_\infty$. The last term, proportional to
$J$, increases the local concentration and is the result of
the source (termination site) some finite distance away
from the initiation site. If $k \to 0$, and ribosomes do not
bind even when the initiation site is empty, the current
$J$ must vanish, and $C(\mathbf{r}) \to C_\infty$, as expected. However,
one cannot simply consider the limit $k \to \infty$ in Eq. 8
because $k$ and $\sigma_1$ are related through $J$, the current determined
by the TASEP in the rest of the chain. This
can be seen by considering the limit $k \to \infty$. If the rest
of the TASEP contains the rate-limiting step to ribosome
throughput, making $J$ very small, it will effectively
block clearance of the initiation site, since all sites of the
chain will be nearly occupied. In this case, $\sigma_1 \approx 1$ and
$k(1 - \sigma_1)$ is small (despite a large $k$), and $C(\mathbf{r}) \approx C_\infty$, as
expected. However, if the rest of the chain is not rate-limiting,
and if clearance of the initiation site can occur
fast enough, $\sigma_1 < 1$ and $k(1 - \sigma_1)$ can be large. In this
case, $C(\mathbf{r}) \approx C_\infty(1 - a/r) + J D^{-1} \int d\mathbf{r}'\, G(\mathbf{r} - \mathbf{r}')\, W_{\rm eff}(\mathbf{r}')$.
The TASEP current $J$ will eventually be balanced with
$J = (1 - \sigma_1) J_0$. Note that $J$ is determined by Eq. 1,
which in turn depends on the entry rate $\alpha$ (in other
words, $k C(a)$). Thus, steady-state currents need to be
self-consistently determined, since $C(a)$ and $\sigma_1$ are not
parameters, but dynamical variables that will in turn be
determined by setting $J = (1 - \sigma_1) J_0$. The analysis which
uses Eq. 1 to find self-consistent explicit expressions for
$J$ will be presented in the Results and Discussion.
Since the averaged bulk concentration profile is spherically
symmetric about the initiation site, only the $\ell = 0$
terms in the expression for $G(\mathbf{r} - \mathbf{r}')$ survive and



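$$J_0 = k C(a) = \frac{4\pi a^2 D k\, C_\infty}{k a + 4\pi a^2 D} + \frac{4\pi a^2 D k\, J}{4\pi D\,(k a + 4\pi a^2 D)} \int_{r' > a} \frac{W_{\rm eff}(\mathbf{r}')}{r'}\, d\mathbf{r}' = \frac{k}{k + 4\pi a D}\left[4\pi a D\, C_\infty + \frac{a}{R}\, J\right], \qquad (10)$$

where

$$\frac{1}{R} \equiv \int_{r' > a} \frac{W_{\rm eff}(\mathbf{r}')}{r'}\, d\mathbf{r}'. \qquad (11)$$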



The surface concentration at the sink surface $a$ is reduced
from the "bulk" value by a factor of $1 + k/(4\pi a D)$,
due to adsorption and diffusional depletion. However,
part of this initiation site concentration is also replenished
at a rate proportional to the flux $J$, due to the
presence of a nearby termination (source) site. The effects
of this replenishment are measured by the mean
inverse separation $1/R$. The "harmonic distance" $R$ defines
the effective distance felt by diffusing ribosomes
as they make their way from the termination end back
to the initiation site. This particular scaling is a
consequence of the solution to Poisson's equation (Eq.
5) in three dimensions, and is related to the capture
probability of diffusing ligands, as analyzed by Berg
and Purcell [Berg & Purcell 1977]. Equation 10 contains
two unknowns, $C(a)$ and $\sigma_1$. We can use the explicit
solution Eq. 1 if we identify the injection rate $\alpha$ of
the TASEP with the unoccupied initiation site current
$J_0 = k C(a) = \alpha$. Equation 1 then relates $k C(a)$ to $\sigma_1$.
A second equation can be obtained by noticing that the flux
itself must be balanced. Upon using $J = k C(a)(1 - \sigma_1)$
in Eq. 10, a second relationship between $k C(a)$ and $\sigma_1$
can be found. Substitution of the solution for $k C(a)$ (in
terms of experimentally known or controlled parameters
$k, C_\infty, a, R, D$) into Eq. 1 determines the self-consistent,
steady-state ribosome current. This analysis, using the
three different explicit forms of Eq. 1 (in the long chain
limit) is presented in the Results and Discussion.
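
The self-consistency loop described above can be sketched numerically. The code below is an illustration under stated assumptions, not the analysis of the Results and Discussion: it uses the long-chain ($N \to \infty$) currents $p/4$, $\alpha(1-\alpha/p)$, $\beta(1-\beta/p)$, writes Eq. 10 as $\alpha = A\,[C_\infty + J/(4\pi D R)]$ with $A = 4\pi a D k/(k + 4\pi a D)$, and solves for $\alpha$ by bisection. All names and the dimensionless parameter values are hypothetical.

```python
import math


def asymptotic_current(alpha, beta, p):
    """Long-chain TASEP current in the three phases (assumed N -> infinity forms)."""
    if alpha >= p / 2 and beta >= p / 2:
        return p / 4
    if alpha < beta:
        return alpha * (1 - alpha / p)
    return beta * (1 - beta / p)


def self_consistent_state(k, c_inf, a, d_coef, r_harm, beta, p, n_iter=200):
    """Solve alpha = A [C_inf + J(alpha) / (4 pi D R)] for the injection rate alpha.

    Requires R > a so that the recycling feedback factor is below one.
    Returns (alpha, J, sigma_1) with sigma_1 = 1 - J / alpha.
    """
    four_pi_ad = 4.0 * math.pi * a * d_coef
    prefactor = k * four_pi_ad / (k + four_pi_ad)               # A in the lead-in
    feedback = prefactor / (4.0 * math.pi * d_coef * r_harm)    # (a/R) k / (k + 4 pi a D)

    def residual(alpha):
        current = asymptotic_current(alpha, beta, p)
        return alpha - prefactor * c_inf - feedback * current

    # residual is increasing in alpha because J(alpha) has slope <= 1 and
    # feedback < 1; J <= alpha gives the upper bracket below.
    lo, hi = 0.0, prefactor * c_inf / (1.0 - feedback)
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    alpha = 0.5 * (lo + hi)
    current = asymptotic_current(alpha, beta, p)
    return alpha, current, 1.0 - current / alpha


if __name__ == "__main__":
    for r_harm in (2.0, 10.0, 1e6):   # hypothetical harmonic distances, in units of a
        print(r_harm, self_consistent_state(1.0, 0.1, 1.0, 1.0, r_harm, 1.0, 1.0))
```

Decreasing $R$ raises the self-consistent injection rate above its no-recycling value $A\,C_\infty$, which is the feedback effect the text describes.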

End-to-End Distribution Weff 

We now find $W_{\rm eff}(\mathbf{r})$ in order to compute $R$ and obtain
$C(a)$. In some cases, the mRNA chain may be anchored
to cellular scaffolding or ER membranes such that the
initiation-termination separation is fixed. If one is interested
in steady-state protein production over a period
which allows little change in initiation-termination distance,
$W_{\rm eff}(\mathbf{r}) = \delta(\mathbf{r} - \mathbf{R})$, and $R = |\mathbf{R}|$. In other cases,
the mRNA may be free to explore numerous conformations
on the protein production time scale. Although it is
possible that long mRNA strands may contain secondary
structure, we will assume that ribosomes, as they move
along the mRNA, melt out these structures. Although
there is evidence that mRNA can contain small, local
loops [Hagerman 1985, Wang et al. 1997], it is less likely
that they have larger-scale tertiary structure. Thus, we
will estimate $W_{\rm eff}$ and $R$ with simple polymer models.
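
To make the harmonic distance concrete, the sketch below evaluates $1/R = \int_{r>a} W(\mathbf{r})/r\, d\mathbf{r}$ for a Gaussian chain of $N$ segments of length $a$. The Gaussian form is an assumption used here as a stand-in for the FJC and WLC distributions of Appendix C, and the function names are hypothetical. For this choice the integral has the closed form $R/a = \sqrt{\pi N/6}\; e^{3/(2N)}$, which the numerical quadrature should reproduce.

```python
import math


def gaussian_w(r, n_segments, seg_len):
    """Gaussian end-to-end density with mean-square distance N * seg_len^2."""
    mean_sq = n_segments * seg_len ** 2
    norm = (3.0 / (2.0 * math.pi * mean_sq)) ** 1.5
    return norm * math.exp(-3.0 * r * r / (2.0 * mean_sq))


def harmonic_distance(w, r_min, r_max, n_grid=20_000):
    """R from 1/R = 4 pi * integral_{r_min}^{r_max} r W(r) dr (midpoint rule).

    Assumes a spherically symmetric W(r), so the angular integral gives 4 pi r^2
    and the 1/r weight leaves a single power of r.
    """
    h = (r_max - r_min) / n_grid
    inv_r = 0.0
    for i in range(n_grid):
        r = r_min + (i + 0.5) * h
        inv_r += 4.0 * math.pi * r * w(r) * h
    return 1.0 / inv_r


if __name__ == "__main__":
    a = 1.0
    for n in (50, 100, 500):
        r_h = harmonic_distance(lambda r: gaussian_w(r, n, a), a, 12.0 * math.sqrt(n) * a)
        print(n, r_h, math.sqrt(n) * a)   # harmonic distance vs. rms end-to-end distance
```

The harmonic distance comes out somewhat smaller than the root-mean-square end-to-end distance because the $1/r$ weight emphasizes close approaches of the two ends.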



FIG. 4: A schematic of the effects of loop forming factors.
The coding region of the mRNA is blue (the ribosomes and
the poly(A) tail are not shown), the noncoding spacers of $m$
and $n$ persistence lengths $\epsilon$ are solid black, while the neglected
short ends are dashed curves. The loop binding factors
are of typical size $d$. (A) Nonlooped conformations in which
the initiation-termination site distribution function is governed
by $W(\mathbf{r}|{\rm open})$. (B) The initiation-termination distribution
function in looped configurations is denoted $W(\mathbf{r}|{\rm loop})$.
$W(\mathbf{r}|{\rm loop})$ is weighted more strongly at small $|\mathbf{r}|$ relative to
$W(\mathbf{r}|{\rm open})$. For stronger attraction between loop binding factors,
the probability of loop formation increases, decreasing the
effective distance $R$ that ribosomes must diffuse to be recycled
back to the initiation site.

As shown in Fig. 2, the mRNA is comprised of three 
segments divided between two qualitatively distinct regions.
Typical coding regions are $\sim 10^3$ base pairs, corresponding
to $N \sim 300$. At low ribosome densities, the
uncovered mRNA base pairs will be rather flexible, and
the effective persistence length $\ell$ will be a local average
between $a$ and the 2-4 nucleotide persistence length
$\epsilon$ of uncovered mRNA. Large reductions in the persistence
length of dsDNA containing segments of single-stranded
regions have also been observed by Mills et al.
[Mills et al. 1994]. More sophisticated theories for variable
persistence lengths can be straightforwardly incorporated;
however, for simplicity, we approximate the persistence
length in the coding region to be a uniform constant
on the order of $\ell = a$, the individual ribosome exclusion
size. The contour length of the coding region is thus
$L_N = Na$ with $N \sim 50 - 500$. The untranslated regions,
or UTRs, between the initiation site and the binding factor
(dark gray), and between the termination site and the
loop-binding factor (yellow), with persistence lengths $\epsilon$,
have contour lengths of $L_m = m\epsilon$ and $L_n = n\epsilon$, respectively.
Typical $L_m, L_n$ are on the order of 100 bases so
that $n, m \sim 20 - 50$. However, extremely long noncoding
segments of order 1 kbp can exist [Mathews et al. 1996],
where $m, n \sim 300$. In what follows we will also neglect all
the excluded volume effects of the remaining short ends
of the mRNA chain.

As demonstrated by Wells et al. [Wells et al. 1998]
in Figure 1B, mRNA can form loops in the presence
of binding proteins. Therefore, we expect that
$W_{\rm eff}(\mathbf{r})$ (and hence $1/R$) will be a linear combination
of $W(\mathbf{r}|{\rm open})$ and $W(\mathbf{r}|{\rm loop})$, the initiation-termination
probability distributions in open and looped mRNA
configurations, respectively. These configurations are
shown in Figs. 4A, B. For simplicity, we will use
probability distributions associated with noninteracting
(phantom) chains and approximate the distributions
$W(\mathbf{r})$ with both freely jointed chain (FJC) and worm-like
chain (WLC) models with appropriate persistence
lengths. The finite-sized, short distance behavior of the
$W(\mathbf{r}|{\rm open, loop})$ will be important for accurately computing
$\langle 1/r \rangle$. As we will see, $W(\mathbf{r}|{\rm loop})$ can be constructed
from the more fundamental quantity $W(\mathbf{r}|{\rm open})$
[Liverpool & Edwards 1995, Sokolov 2002]. Since we are
eventually interested in either ribosome transport from
termination to initiation or in activation/deactivation of
initiation or release sites due to direct contact with the
end proteins, we compute in Appendix C the distance
distribution $W(\mathbf{r}|{\rm open})$ in the state where site $i = N$ is
occupied and site $i = 1$ is unoccupied.

Using the $W(\mathbf{r}|{\rm open})$ computed in Appendix C, we can
thus consider the contributions of looped configurations
to the effective end-to-end distance distribution. The
binding energy between the 5'-cap and poly(A) tail proteins,
$-U_0$ (in units of $k_B T$), determines the probability
that the chain is looped:



Pioop{n,m,N;Uo) = 



exp(-G';oop) 



exp{-Gioop) + exp{-Gopen) 

fo 

e^o -I- f2o(open)/f2o(loop) ' 

(12) 

where the free energies of a closed and open mRNA
chain are $G_{\rm loop} = -U_0 - S_{\rm loop}$ and $G_{\rm open} = -S_{\rm open}$,
respectively. Since the ratio of the number of configurations
under looped and open chain conditions is the
ratio of probabilities of loop formation in the absence of
head-tail interactions ($U_0 = 0$),
$\Omega_0({\rm open})/\Omega_0({\rm loop}) = (1 - P^{(0)}_{\rm loop})/P^{(0)}_{\rm loop}$, and

$$P_{\rm loop} = \frac{P^{(0)}_{\rm loop}\, e^{U_0}}{P^{(0)}_{\rm loop}\,(e^{U_0} - 1) + 1}. \qquad (13)$$

The probability that the ends of a noninteracting chain (in the
absence of loop binding proteins) fall within the interaction
volume defined by a thin spherical shell of thickness $\delta$
(the binding interaction range) is approximately

$$P^{(0)}_{\rm loop} \approx 4\pi d^2\delta \int W_\varepsilon({\bf r}_1|{\rm open})\, W_a({\bf r}_2 - {\bf r}_1|{\rm open})\, W_\varepsilon({\bf r}_d - {\bf r}_2|{\rm open})\, d{\bf r}_1 d{\bf r}_2 \approx 4\pi d^2\delta \left(\frac{3}{2\pi L_T^2}\right)^{3/2}, \qquad |{\bf r}_d| = d, \qquad (14)$$

where $d$ is the typical size of the loop binding factors and
$L_T = \sqrt{L_N^2 + L_m^2 + L_n^2} = \sqrt{Na^2 + (m+n)\varepsilon^2}$. We have
assumed the total radius of gyration $L_T \gg a$, and used
a Gaussian chain as a qualitative approximation for the
distributions used in the calculation of $P^{(0)}_{\rm loop}$. The
conditional probability distribution $W(r|{\rm loop})$ for a looped
chain is

$$W(r|{\rm loop}) = \frac{W_a(r|{\rm open})\, W_\varepsilon(r|{\rm open})}{\int_{r'>a} W_a(r'|{\rm open})\, W_\varepsilon(r'|{\rm open})\, dr'}, \qquad (15)$$

where $W_\ell(r|{\rm open})$ denotes the single segment, open chain
probability distributions in the two segments with persistence
lengths $\ell = a, \varepsilon$. For $a\sqrt{N} \gg \varepsilon\sqrt{m+n}$, the loop
distribution given by Eq. 15 is qualitatively similar to the
distribution function $W_\varepsilon(r|{\rm open})$ of the short segment of
persistence length $\varepsilon$.
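
As a numerical illustration of Eqs. 12-14, the following minimal sketch evaluates the looping probability as a function of the binding energy. It assumes the Gaussian-chain estimate $P^{(0)}_{\rm loop} \approx 4\pi d^2\delta\,(3/2\pi L_T^2)^{3/2}$ with $L_T^2 = Na^2 + (m+n)\varepsilon^2$; the function names and the chosen values of $N$, $m$, $n$ are illustrative assumptions, not part of the original analysis.

```python
import math


def bare_loop_probability(N, m, n, a=1.0, eps=0.2, d=1.0, delta=0.1):
    """Gaussian-chain estimate of P_loop^(0) (assumed closed form of Eq. 14):
    chance that the chain ends lie in a shell of radius d and thickness delta."""
    L_T_sq = N * a**2 + (m + n) * eps**2  # L_T^2 = N a^2 + (m + n) eps^2
    return 4.0 * math.pi * d**2 * delta * (3.0 / (2.0 * math.pi * L_T_sq)) ** 1.5


def loop_probability(U0, p0):
    """Eq. 13, rewritten as p0 / (p0 + (1 - p0) e^{-U0}); this is the same
    expression, arranged so that large positive U0 does not overflow."""
    return p0 / (p0 + (1.0 - p0) * math.exp(-U0))


# Illustrative parameters (assumed): N = 100 coding segments, m = n = 30.
p0 = bare_loop_probability(N=100, m=30, n=30)
for U0 in (0.0, 4.0, 8.0, 12.0):
    print(f"U0 = {U0:4.1f}   P_loop = {loop_probability(U0, p0):.4f}")
```

With these inputs the bare looping probability is of order $10^{-4}$, so a binding energy of several $k_BT$ is needed before looped configurations carry appreciable weight.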

Using Eqs. 13, 14, 15, and C5 we construct the effective
initiation-termination distance distribution

$$W_{\rm eff}(r) = (1 - P_{\rm loop})\, W(r|{\rm open}) + P_{\rm loop}\, W(r|{\rm loop}). \qquad (16)$$

$W_{\rm eff}(r)$ is plotted in Appendix C (Fig. 11) for various
$U_0$. Qualitatively similar loop probability distributions
have also been computed by Liverpool and Edwards
[Liverpool & Edwards 1995] within the WLC model but
without finite-sized molecules at the ends. Here and in all
subsequent analyses, we use the typical parameters $\varepsilon/a = 0.2$,
$d = a$, and $\delta/a = 0.1$. As $U_0$ is increased, the distance
distribution function switches over from $W(r|{\rm open})$
to $W(r|{\rm loop})$. The statistics of $W(r|{\rm open})$ and $W(r|{\rm loop})$
are governed by $L_N = Na$ and $L_{mn} = (m+n)\varepsilon$, respectively.
The loop forming factors, since they are close to
the initiation and termination sites ($L_{mn} \ll L_N$), enhance
the probability that the ends are close to each
other, particularly when the binding energy $U_0$ is large.

FIG. 5: The effective diffusional distance or "harmonic distance"
$R/a = [a \int dr\, W_{\rm eff}(r)/r]^{-1}$ over which recycled ribosomes
must diffuse. (A) The dependence of $R/a$ on the
loop binding energy $U_0$ is shown for $N = 100$ persistence
lengths of coding mRNA. For large binding energies $U_0$, the
initiation and termination sites are brought closer together.
The crossover between the end-to-end distribution function
of a free chain to that of a loop occurs near $U_0 \sim 8$.
Increasing the length of the short noncoding ends of the mRNA
predominantly increases the typical distance $R$ in the large
$U_0$, looped regime. (B) The $N$-dependence of $R/a$ with the
ratio of noncoding persistence lengths to coding persistence
lengths $(m+n)/N = 1/2$. The $N$-dependence manifests itself
primarily in the low $U_0$, open chain regime. (C) The $N$
dependence of $R/a$ for various $U_0$.



The harmonic distance, $R$, determined using $W_{\rm eff}$ is
shown in Figs. 5A, B as functions of loop binding energy
$U_0$. The result given by the last line in Eq. 14, when used
in Eqs. 13 and 16, qualitatively describes a crossover in
$W_{\rm eff}$ from $W(r|{\rm open})$ to $W(r|{\rm loop})$ behavior at

$$U_0 \approx \ln\left[\frac{1}{6}\sqrt{\frac{2\pi}{3}}\,\frac{L_T^3}{d^2\delta}\right] + O(d^2\delta/L_T^3). \qquad (17)$$
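
The crossover energy can be checked against the condition $P_{\rm loop} = 1/2$. The sketch below is a hedged illustration that assumes the Gaussian estimate $P^{(0)}_{\rm loop} \approx 4\pi d^2\delta\,(3/2\pi L_T^2)^{3/2}$ and compares the logarithmic approximation with the exact solution of $e^{U_0} = (1 - P^{(0)}_{\rm loop})/P^{(0)}_{\rm loop}$; the function names are hypothetical.

```python
import math


def crossover_energy(L_T, d=1.0, delta=0.1):
    """Leading-order crossover energy, ln[(1/6) sqrt(2 pi / 3) L_T^3 / (d^2 delta)]."""
    return math.log(math.sqrt(2.0 * math.pi / 3.0) / 6.0 * L_T**3 / (d**2 * delta))


def exact_crossover_energy(L_T, d=1.0, delta=0.1):
    """Energy at which P_loop = 1/2 exactly, given the Gaussian estimate of P_loop^(0)."""
    p0 = 4.0 * math.pi * d**2 * delta * (3.0 / (2.0 * math.pi * L_T**2)) ** 1.5
    return math.log((1.0 - p0) / p0)


# Assumed example: L_T = 10 a (roughly N = 100 with short noncoding ends),
# d = a and delta = 0.1 a, all lengths in units of a.
approx = crossover_energy(10.0)
exact = exact_crossover_energy(10.0)
print(f"approximate U0 = {approx:.3f}, exact U0 = {exact:.3f}")
```

For these values both estimates give a crossover close to $U_0 \approx 7.8$, consistent with the crossover near $U_0 \sim 8$ quoted for $N = 100$.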



In Fig. 5A, $R/a$ is shown with $N = 100$, but at various
noncoding lengths $m + n$. In the large binding strength
limit, $R/a$ depends only on the short distance $(m+n)\varepsilon$.
When loops rarely form, the typical separation between
initiation and termination sites can only depend on $L_N$,
which is the only quantity varied in Fig. 5B. Notice that
the exact FJC solution (Appendix C), or truncated WLC
solution, for which $W(r < a|{\rm open}) = 0$, ensures that $R/a > 1$
for all values of $m$, $n$, $N$, and $U_0$. The dependence of $R/a$
on $N$ is shown in Fig. 5C for various $U_0$. When $U_0$ is
small, the initiation-termination harmonic distance $R$ is
controlled by $L_N$ and increases as $\sqrt{N}$. For larger $U_0$, the
chain is partially bound into a loop where the distance
is controlled by the much shorter $L_{m+n}$. The harmonic
distance $R$ remains small unless $N$ becomes extremely
large so that entropy can dominate and the loop ends
can unbind.

We now couple our mathematical models by incorporating
the $W_{\rm eff}$-weighted inverse harmonic distance $a/R$
into the local, effective concentration $C(a; R)$ given by
Eq. 10. The effective injection rates $\alpha = kC(a)$ that control
the translation rate within the steady-state TASEP
are then self-consistently determined.

RESULTS AND DISCUSSION 

Here, we compute the possible currents $J$ and the parameter
space in which each is valid. We will use the
exact solution Eq. 1, or its three asymptotic forms (Eqs.
3), as well as $J = kC(a)(1 - \sigma_1)$ in Eq. 10, to find all
relevant quantities and parameter phase boundaries.

Substituting $J = kC(a)(1 - \sigma_1)$ into Eq. 10 and
solving for $\sigma_1$, we find

$$1 - \sigma_1 = \frac{4\pi DR}{k}\left(1 - \frac{C_\infty}{C(a)}\right) + \frac{R}{a}. \qquad (18)$$

Upon multiplying Eq. 18 by $kC(a)$, we find

$$kC(a)(1 - \sigma_1) = 4\pi DR\,(C(a) - C_\infty) + \frac{R}{a}kC(a) = J(kC(a),\beta,p) = \begin{cases} kC(a)\,(1 - kC(a)/p) \\ \beta\,(1 - \beta/p) \\ p/4 \end{cases} \qquad (19)$$

To find $C(a)$ in terms of known parameters, we use
the explicit solutions of the TASEP for the current
$J(kC(a),\beta,p)$ (Eq. 1 or 3) as indicated on the right-hand
side of Eq. 19. The exact solution Eq. 1 yields an
$(N+2)$-order equation in $kC(a)$, which we solve numerically.
Only one of the $N+2$ roots of Eq. 19 is real, yields
occupations between zero and one, and is physically
relevant. The self-consistent solutions for $kC(a)$ are used
to evaluate $J(kC(a),\beta,p)$, which are plotted in Figures
6A, B. As expected, shorter chains yield slightly higher
current. Larger $D$ also increases the current and makes
the approximate maximal current phase attainable at
smaller $kC_\infty/p$. Asymptotic limits for the current near
phase boundaries and at large $N$ are given in Appendix
D.

FIG. 6: The numerically determined steady-state currents at
finite $N$. The self-consistent currents were found by numerically
finding the roots of the polynomial in $J$ obtained by
substituting the last line of expression 19 into the exact equation
1. (A) Steady-state currents as a function of the injection rate
$kC_\infty/p$ for $R/a = 3$ and various $\bar{D} = 4\pi aD/k = 0.25, 1.0, 10$.
For $\bar{D} = 10$, $N = 10$ and $N = 50$ are compared. (B) $J$ as a
function of length $N$ for $\bar{D} = 1, 10$ and $kC_\infty/p = 0.3, 1$.
Panel label: $\beta = 0.75$; $R/a = 3$.



The numerical solutions depicted in Fig. 6 show that
for even modest $N > 10$, the currents are accurately
described by their asymptotic expressions, Eq. 3. Therefore,
we can very accurately solve for $kC(a)$ and steady-state
ribosome currents by separately considering each phase
and its associated asymptotic form of $J$.

First assume that the detachment rate $\beta > p/2$ and
consider the maximal current (phase III in the TASEP)
where $J = p/4$. This occurs when both $\alpha, \beta > p/2$.
To determine the parameter regime in which $J = p/4$
holds, we solve for $C(a)$ and determine for what range of
parameters $\alpha = kC(a) > p/2$. Using $J = p/4$ in Eq. 19,
we find

$$C(a) = \frac{p/4 + 4\pi DRC_\infty}{4\pi DR + Rk/a}. \qquad (20)$$

The criterion for maximal current, $k > p/(2C(a))$, is thus

$$k > \frac{p\,(4\pi DR + k(R/a))}{p/2 + 8\pi DRC_\infty}. \qquad (21)$$



Upon solving Eq. 21 for $k$, we find the minimum $k = k^*$
required to achieve maximal current $J = p/4$:

$$\frac{kC_\infty}{p} > \frac{k^*C_\infty}{p} = \left[2 - \frac{p}{4\pi aDC_\infty}\left(1 - \frac{a}{2R}\right)\right]^{-1}. \qquad (22)$$

Note that for large enough $p/(4\pi aDC_\infty)$ the critical value
$k^*$ can diverge. The divergence is more likely for larger
$R$ and occurs when there are simply not enough ribosomes
nearby to provide a large enough "on" rate $\alpha$ to
achieve maximal current. Even when the source (termination
end) is held at the initiation site ($R = a$),
there is the possibility that $k^*$, and maximal current,
are never attained. This behavior arises because even
for ribosomes released at an infinitely absorbing spherical
initiation surface, there is a probability of escape
[Berg & Purcell 1977].
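
The dependence of the critical "on" rate on diffusion and source distance can be tabulated directly. The sketch below is a minimal illustration, assuming Eq. 22 in the form $k^*C_\infty/p = [2 - (1 - a/2R)/x]^{-1}$ with $x = 4\pi aDC_\infty/p$; the function name and the convention of returning infinity when the maximal current phase is unattainable are choices made here for illustration.

```python
import math


def critical_on_rate(x, R_over_a):
    """Dimensionless critical binding rate k* C_inf / p.

    x        : 4 pi a D C_inf / p (diffusive supply relative to elongation)
    R_over_a : harmonic distance between termination and initiation sites
    Returns math.inf when the maximal current phase cannot be reached.
    """
    denom = 2.0 - (1.0 - 1.0 / (2.0 * R_over_a)) / x
    return 1.0 / denom if denom > 0.0 else math.inf


for R_over_a in (1.5, 4.0, 10.0):
    print(R_over_a, critical_on_rate(0.5, R_over_a))

# Slower diffusion or fewer ribosomes (smaller x) at R/a = 10:
print(critical_on_rate(0.4, 10.0))  # no finite k* reaches J = p/4
```

For $x = 1/2$ the expression reduces to $k^*C_\infty/p = R/a$, which makes the sensitivity to the initiation-termination distance explicit.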

Next, let us consider small $\beta$ and large $\alpha = kC(a)$.
The mRNA has a high ribosome occupancy and a steady-state
current $J = \beta(1 - \beta/p)$. This regime (phase II)
is termination rate-limited and occurs for $\beta < p/2$ and
$\beta < \alpha = kC(a)$. Upon using $J = \beta(1 - \beta/p)$ in Eq. 19,

$$\beta < kC(a) = k\,\frac{\beta(1 - \beta/p) + 4\pi DRC_\infty}{4\pi DR + kR/a}. \qquad (23)$$

The only physical range of $\beta$ that satisfies inequality 23
is

$$\beta < \beta^*(k) \equiv \frac{p}{2}\left[\frac{R}{a}(\bar{D} + 1) - 1\right]\left[\sqrt{1 + \frac{4(R/a)\bar{D}kC_\infty}{p\,((\bar{D}+1)R/a - 1)^2}} - 1\right], \qquad (24)$$



where $\bar{D} = 4\pi aD/k$. Equation 24 defines the phase
boundary between the high-density, exit rate-limited
phase (II) and the low-density, initiation rate-limited
phase (I). This phase boundary is plotted as a function
of $kC_\infty/p$ for fixed $4\pi aDC_\infty/p = 0.5$ in Fig. 7B. In the
limit $kC_\infty/p \to 0$, the phase boundary straightens as in
the standard TASEP and is approximately

$$\frac{\beta^*}{p} \approx \frac{kC_\infty}{p}\left[1 - \frac{(1 - a/R)\,k}{4\pi aD} + O(k^2)\right]. \qquad (25)$$
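
The phase boundary between the low- and high-density phases can be evaluated numerically. The following sketch assumes that $\beta^*(k)$ is the positive root of $\beta^2/p + B\beta - (R/a)\bar{D}kC_\infty = 0$ with $B = (\bar{D}+1)R/a - 1$, which is what inequality 23 gives when treated as an equality; the function name and sample parameter values are illustrative.

```python
import math


def beta_star(k, p, C_inf, D, a, R):
    """Phase (I)-(II) boundary: positive root of
    beta^2/p + B*beta - (R/a)*Dbar*k*C_inf = 0, with B = (Dbar + 1) R/a - 1."""
    Dbar = 4.0 * math.pi * a * D / k
    B = (Dbar + 1.0) * R / a - 1.0
    c = (R / a) * Dbar * k * C_inf
    return 0.5 * p * B * (math.sqrt(1.0 + 4.0 * c / (p * B * B)) - 1.0)


# Assumed sample values: units with a = 1 and p = 1, Dbar = 1, R/a = 3.
k, p, a, R = 1.0, 1.0, 1.0, 3.0
D = k / (4.0 * math.pi * a)

b = beta_star(k, p, 0.3, D, a, R)
print(f"beta* = {b:.6f}")

# Small-concentration limit: beta* approaches k C_inf / (1 + (1 - a/R) k / (4 pi a D)).
C_small = 1e-4
limit = k * C_small / (1.0 + (1.0 - a / R) * k / (4.0 * math.pi * a * D))
print(beta_star(k, p, C_small, D, a, R), limit)
```

In the dilute limit the boundary is linear in $kC_\infty$ with a slope reduced below unity by diffusional depletion, recovering the straight line of the standard TASEP when $D \to \infty$.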



Finally, when $\beta > \beta^*(k)$, but the entrance rate $kC(a)$
is low ($< p/2$), a low density phase with $J = \alpha(1 - \alpha/p)
= kC(a)(1 - kC(a)/p)$ exists. The phase boundary
delineating the low density phase (I) is defined by $k < k^*$
and $\beta = \beta^*(k)$. Upon using the current $J = kC(a)(1 -
kC(a)/p)$ in Eq. 19, we find $kC(a) = \beta^*(k)$, and the current
in the initiation rate-limited phase (I):

$$J_I = \frac{p}{2}\,\frac{R}{a}(\bar{D}+1)\left[(\bar{D}+1)\frac{R}{a} - 1\right]\left[\sqrt{1 + \frac{4(R/a)\bar{D}kC_\infty}{p\,((\bar{D}+1)R/a - 1)^2}} - 1\right] - 4\pi DRC_\infty. \qquad (26)$$

In the limit $p/(kC_\infty) \to \infty$,

$$J_I(p \to \infty) \sim \frac{4\pi aD}{4\pi aD + k(1 - a/R)}\,kC_\infty - \frac{(4\pi aD)^2\,(k + 4\pi aD)\,kC_\infty}{(k(1 - a/R) + 4\pi aD)^3}\,\frac{kC_\infty}{p} + O(p^{-2}), \qquad (27)$$

which reduces to the result one would expect from infinitely
fast initiation site clearance.

Summarizing, the large-$N$ steady-state ribosome currents
(given by Eq. 1) in terms of ribosome concentrations
and kinetic "on" rates are

$$\begin{array}{lll}
{\rm (I)} & k < k^*,\ \beta > kC(a) & J = J_I = kC(a)\,(1 - kC(a)/p) \\
{\rm (II)} & \beta < p/2,\ \beta < kC(a) & J = J_{II} = \beta\,(1 - \beta/p) \\
{\rm (III)} & k > k^*,\ \beta > p/2 & J = J_{III} = p/4
\end{array} \qquad (28)$$

where $kC(a)$ in phase (I) is expressed in terms of known
parameters according to Eq. 19. The mean occupations
of the initiation and termination sites, in each
regime, can now be readily found. At the first site,
$\sigma_1 = 1 - J/(kC(a))$, where we use $J = J_I$, $\beta(1 - \beta/p)$,
or $p/4$ (currents associated with each phase), and $kC(a)$
found from Eqs. 19, 23, or 20. Similarly, the occupation
at the last site is $\sigma_N = J/\beta$. All of our results
can be expressed in terms of three of the four nondimensional
parameters: $\bar{D} = 4\pi aD/k$, $kC_\infty/p$, $4\pi aDC_\infty/p$,
and $R/a$. We shall present our results in terms of the
relevant nondimensional parameters appropriate for the
discussion at hand.
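
The three-phase summary can be turned into a small classifier. The sketch below is an illustration under stated assumptions: it uses the large-$N$ asymptotic currents, takes the self-consistent low-density injection rate $kC(a)$ to be the positive root of $\alpha^2/p + B\alpha - (R/a)\bar{D}kC_\infty = 0$ with $B = (\bar{D}+1)R/a - 1$, and treats $k > k^*$ as equivalent to that root exceeding $p/2$. Function names and sample values are hypothetical.

```python
import math


def low_density_rate(k, p, C_inf, D, a, R):
    """Self-consistent injection rate alpha = k C(a) in phase (I)."""
    Dbar = 4.0 * math.pi * a * D / k
    B = (Dbar + 1.0) * R / a - 1.0
    c = (R / a) * Dbar * k * C_inf
    return 0.5 * p * B * (math.sqrt(1.0 + 4.0 * c / (p * B * B)) - 1.0)


def steady_state_current(k, beta, p, C_inf, D, a, R):
    """Return (phase label, current J) for a long mRNA."""
    alpha = low_density_rate(k, p, C_inf, D, a, R)
    if alpha < beta and alpha < 0.5 * p:
        return "I", alpha * (1.0 - alpha / p)   # initiation rate-limited
    if beta < 0.5 * p:
        return "II", beta * (1.0 - beta / p)    # termination rate-limited
    return "III", 0.25 * p                      # maximal current


# Assumed sample values: a = p = C_inf = 1, R/a = 10, 4 pi a D C_inf / p = 0.6.
D = 0.6 / (4.0 * math.pi)
for k, beta in ((1.0, 0.75), (5.0, 0.75), (5.0, 0.3), (1.0, 0.3)):
    phase, J = steady_state_current(k, beta, 1.0, 1.0, D, 1.0, 10.0)
    print(f"k = {k}, beta = {beta}: phase {phase}, J = {J:.4f}")
```

Once the phase and current are known, the end-site occupations follow from $\sigma_1 = 1 - J/(kC(a))$ and $\sigma_N = J/\beta$.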

FIG. 7: The modified phase diagram for translation rates
along long ($N \to \infty$) mRNAs. (A) The minimum binding
rate (Eq. 22) required to support the maximal current phase
assuming that $\beta/p > 1/2$. This value depends on the bulk
ribosome concentration $C_\infty$ and the distance $R$ between the
initiation and termination sites. (B) The modified phase
diagrams as functions of $kC_\infty/p$ for $4\pi aDC_\infty/p = 0.5$ and
various $R/a$. (C) Modified phase diagrams as functions of $kC_\infty/p$
for fixed $\bar{D} = 4\pi aD/k$ and $R/a = 10$.

Figure 7A shows the critical value $k^*$, above which
an $N \to \infty$ TASEP is in the maximal current phase
(provided $\beta/p > 1/2$). When $C_\infty$ is small and $p$ is
large, there are not enough ribosomes in the cytoplasm to
feed the initiation site fast enough compared to the clearance
rate $p$. Therefore the maximal current ($J = p/4$)
arises only when the binding is efficient and $k > k^*$ is
large. For smaller $R$ (termination site close to the initiation
site), smaller values of $4\pi aDC_\infty/p$ can still support
maximal current. From Eq. 22, we see that when
$4\pi aDC_\infty/p < (1 - a/(2R))/2$, the critical value $k^*$
diverges and the maximal current can never be reached.
There are simply not enough ribosomes, or the diffusion is
too slow, for there to be sufficient concentration at the
initiation site to support the maximal current phase.
If the diffusion constant $D$ and $C_\infty$ are chosen such
that, for example, $4\pi aDC_\infty/p$ is small, the critical values
$k^*$ vary considerably with $R/a$, as shown by the
green points ($4\pi aDC_\infty/p = 1/2$) in Fig. 7A. The
effects of depletion arise suddenly, with onset only at
values of $4\pi aDC_\infty/p \lesssim 0.6$. For large $R/a$, values
of $4\pi aDC_\infty/p \approx 0.5$ will render the critical $k^*$ values
very sensitive to $R$. If the initiation site has an
interaction size of $a \sim 10$ nm, and $p \sim 2-3$/s (20-30
codons/s) [Kruger et al. 1998], a diffusion constant
of $D \sim 10^{-8}$ cm$^2$/s requires an effective concentration
of $C_\infty \sim 0.01-0.02\,\mu$M for the phase diagram to
be sensitive to diffusional depletion and $R$. Although
typical total cytoplasmic ribosome concentrations are
$C_\infty \approx 1\,\mu$M, many components must assemble in order
to activate a translation-viable ribosome. For example,
eIF4F exists at 0.01-0.2 times the total ribosome concentration
[Duncan et al. 1987]. Furthermore, this already
low abundance of eIF often needs to be further phosphorylated
to be active. Thus, the effective concentrations
$C_\infty$ (and even diffusion constants) appropriate for our
model may very well be low enough to fall within the
range for the phase boundaries to be extremely sensitive
to diffusional effects.
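
The order-of-magnitude estimate above can be reproduced with a short unit conversion. The sketch below is a hedged illustration: the diffusion constant $D = 10^{-8}$ cm$^2$/s is an assumed value (chosen because it makes $4\pi aDC_\infty/p$ close to 0.5 for the quoted $a$, $p$, and $C_\infty$), and the function name is hypothetical.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole


def depletion_parameter(a_nm, D_cm2_per_s, C_uM, p_per_s):
    """Dimensionless supply parameter 4 pi a D C_inf / p.

    a_nm        : interaction size of the initiation site, in nm
    D_cm2_per_s : ribosome diffusion constant, in cm^2/s (assumed value below)
    C_uM        : effective ribosome concentration, in micromolar
    p_per_s     : elongation rate, in ribosome steps per second
    """
    a_cm = a_nm * 1e-7                           # 1 nm = 1e-7 cm
    C_per_cm3 = C_uM * 1e-6 * AVOGADRO / 1000.0  # mol/L -> molecules/cm^3
    return 4.0 * math.pi * a_cm * D_cm2_per_s * C_per_cm3 / p_per_s


# Effective concentration of translation-competent ribosomes, ~0.015 uM:
low = depletion_parameter(a_nm=10.0, D_cm2_per_s=1e-8, C_uM=0.015, p_per_s=2.5)
# Total cytoplasmic ribosome concentration, ~1 uM:
high = depletion_parameter(a_nm=10.0, D_cm2_per_s=1e-8, C_uM=1.0, p_per_s=2.5)
print(f"effective pool: {low:.2f}, total pool: {high:.1f}")
```

Under these assumptions the parameter is near 0.5 for the small effective pool but well above unity for the total ribosome pool, which is why the sensitivity to $R$ depends on how many ribosomes are actually translation-competent.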

Figures 7B,C show the steady-state phase diagrams
as functions of $\beta/p$ and effective binding rate $kC_\infty/p$.
In these phase diagrams, as in the unperturbed ones defined
by Eq. 3, the upper left region corresponds to a
low density phase, the lower right region corresponds to
a high density phase, and the upper right region describes
a half-occupied (except near the ends $i = 1, N$), maximal
current phase. The current $J$ is constant throughout the
maximal current phase and is not changed if $kC_\infty/p$ or $\beta/p$
is increased beyond $k^*C_\infty/p$ and 1/2, respectively. The
phase diagram is modified by ribosome diffusion and depletion
near the initiation site. The unmodified phase
boundary between phases (I) and (II) of the TASEP (Eq.
3) would simply be defined by the straight line segment
$\beta/p = kC_\infty/p$. The main effect of diffusional depletion
(by the initiation sink) and replenishment (by the termination
source) on the standard phase diagram (Fig. 3) is
to shift the low density-maximal current phase boundary
to larger effective injection rates $kC_\infty/p$ and bend
the low density-high density phase boundaries accordingly.
Figure 7B depicts the phase boundaries defined
by Eqs. 22 and 24 for fixed $R/a = 3/2, 4, 10, \infty$, and
fixed $4\pi aDC_\infty/p = 1/2$ as indicated by the green points
in Fig. 7A. In this example $k^*C_\infty/p = 3/2, 4, 10$ for
$R/a = 3/2, 4, 10$, respectively. Note that for $R/a \to \infty$,
$k^*$ diverges and the maximal current phase is never
attained. If $4\pi aDC_\infty/p < 1/2$, then there will be a finite
value of $R/a$ at which $k^*$ diverges.

If, instead, $\bar{D} = 4\pi aD/k$ is held fixed, the phase
boundaries are nearly straight, as shown in Fig. 7C.
Here, we fixed $R/a = 10$ and plotted the phase diagrams
for $\bar{D} = 0.05, 0.1, 0.5, 3$. The corresponding values
of $kC_\infty/p$ above which the maximal current phase
is attained are $k^*C_\infty/p = (1/2)(1 + (1 - a/(2R))/\bar{D}) =
10, 21/4, 29/20$, and $79/120$, respectively.
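
The quoted thresholds follow from exact rational arithmetic. The sketch below is a small check, assuming the fixed-$\bar{D}$ form $k^*C_\infty/p = \frac{1}{2}\left(1 + (1 - a/2R)/\bar{D}\right)$; the function name is hypothetical.

```python
from fractions import Fraction


def critical_on_rate_fixed_dbar(dbar, R_over_a):
    """k* C_inf / p at fixed Dbar = 4 pi a D / k, as an exact fraction."""
    dbar = Fraction(dbar)
    R_over_a = Fraction(R_over_a)
    return Fraction(1, 2) * (1 + (1 - 1 / (2 * R_over_a)) / dbar)


# Dbar values given as strings so that 0.05 and 0.1 are parsed exactly.
for dbar in ("0.05", "0.1", "0.5", "3"):
    print(dbar, critical_on_rate_fixed_dbar(dbar, 10))
```

Because $\bar{D}$ is held fixed here, the threshold is always finite, in contrast to the case of fixed $4\pi aDC_\infty/p$, where $k^*$ can diverge.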



Our results up to this point are contingent on the
fact that measurements are averaged over time scales
such that the TASEP and the diffusion processes have
reached steady-state, and the mRNA chain distribution
has thermally equilibrated. The possibility exists that
the chain conformations are not in thermodynamic equilibrium
while the TASEP and the bulk ribosome diffusion
have reached steady-state for a given chain conformation.
Thus, although not relevant within each of the
three well-defined physical processes, the issue of kinetic
versus thermodynamic control of ribosome throughput
arises when one considers measurements over time scales
that are insufficient to allow equilibration of the mRNA
chain. The consequences of this are discussed in the
following section.

EXPERIMENTAL CONSEQUENCES AND 
PROPOSED MEASUREMENTS 

The basic physical mechanisms described in our model
for mRNA translation suggest a number of experimental
tests. However, it must be emphasized that the model
is meant to provide qualitative guidelines most useful for
studying trends and how they depend on physical parameters.
Translation occurring in vivo involves too many
molecular species and biochemical processes to be quantitatively
modeled, especially in the absence of significantly
more detailed experimental findings. Nonetheless,
our proposed mechanisms can be probed with carefully
designed, simplified, in vitro experiments. Here, we discuss
in detail the basic expected phenomena and their
regimes of validity.

First note from Figure 6 and from Appendix D that
the exact currents for a finite number of codons $N$ very
rapidly approach the asymptotic values given by Eq. 3 as
$N$ increases. Even when $N$ is only $\sim 10-50$, the steady-state
ribosome currents are only a few percent off the
exact $N = \infty$ results. In other words, Eq. 3 is a very
good approximation to the exact solution Eq. 1 for $N > 10$.
Therefore, as a mental guide, it is typically sufficient to
consider the currents $J$ corresponding to an infinite chain
($N = \infty$) given by Eq. 3, but nonetheless consider a
finite initiation-termination separation (measured by the
harmonic distance $R$).

Polysomal Density Variations 

Although we have focused on the steady-state current,
the particle (ribosome) densities in each of the three current
regimes are different and may be detected. In the
TASEP model, the ribosome density profiles along the
mRNA chain vary only near the initiation and termination
ends. In the interior of the mRNA, the density is
relatively uniform and is given by the last column in
Eqs. 3. In the exit-rate limited phase (small $\beta/p$), where
$J = \beta(1 - \beta/p)$, the midpoint density $\sigma_{N/2} \approx 1 - \beta/p$ is
high, while in the low injection rate case, $J = \alpha(1 - \alpha/p)$,
and $\sigma_{N/2} \approx \alpha/p$ is low. The typical density in the maximal
current regime is $\sigma \approx 1/2$. These densities are also
approximately correct when one explicitly treats large ribosomes
that occlude many codon "lattice sites." Therefore,
we might expect that one may be able to predict
in which current regime translating mRNA exists if ribosome
densities can be estimated from images taken
with e.g. AFM or EM techniques. For example, in Figure
1A, the high density of ribosomes suggests that the
system is in phase (II), where the steady-state current
$J = \beta(1 - \beta/p)$ is a function only of the detachment rate
$\beta$.

Kinetic Binding Rate and Ribosome Concentration Dependences

Figure 7C shows the minimum effective attachment rate
$k^*C_\infty/p$ necessary for a large system to be in the maximal
current regime (where the ribosome current $J \approx p/4$) as a
function of the effective ribosome diffusion constant. An
additional requirement is that the effective detachment
rate $\beta/p > 1/2$. The value of $k^*$ can be tuned perhaps
by substitution of the codons comprising the initiation
sites, or by other physical means. Although ribosome
diffusion constants are difficult to vary over a wide range
(by modifying the solution viscosity), the critical $k^*$ is
a very sensitive function of $D$, particularly for small $D$.
It is thus possible that slightly increasing the ribosome
diffusivity can dramatically decrease the $k^*$ necessary for
the system to be in the maximal current regime.

As mentioned, changing the mRNA length $N$ does not
significantly affect the overall steady-state current along
the chain (beyond about $N \sim 10-20$) but it can change
the statistics of the initiation-termination separation by
changing $R$. Increasing the harmonic separation $R$ has
qualitatively the same effect as decreasing the ribosome
diffusivity, since terminated ribosomes now have further
to diffuse back to the initiation site. For

$$D < \frac{p\,(1 - a/(2R))}{8\pi aC_\infty} \qquad (29)$$

the maximal current regime is never reached. This can
be easily seen from equation 22. Thus, rather than tuning
the ribosome diffusivity, decreasing $C_\infty$ may preclude
the system from entering the maximal current phase if
Eq. 29 is satisfied. There are simply not enough ribosomes
available for sufficient initiation to be achieved so that
the maximal current phase arises.

When inequality 29 is not satisfied, the maximal current
phase can exist. In Figure 8A, we replot the phase
diagram corresponding to $R/a = 10$ shown in Fig. 7.
Fixing the parameter $4\pi aDC_\infty/p = 0.6$ allows $k$ to be
the only free parameter. This kinetic "on" rate $k$ can be
tuned by varying ribosome recruitment proteins such as
eIF4E. If $\beta/p > 1/2$, and $C_\infty$, $D$, and $p$ are held constant,
increasing $k$ from a sufficiently small value allows one to
traverse the trajectory $S_1$. The steady-state ribosome
current starts in the low density phase (I) with current
given by Eq. 26. As $k$ is increased, the steady-state
current increases until it continuously crosses over into
the maximal current regime (III), where the ribosome
throughput is given by $J = p/4$. Further increasing
$k$ when inside the maximal current phase (III) will no
longer affect the steady-state ribosome current. If, however,
$\beta/p < 1/2$, the current behavior abruptly crosses
over (along trajectory $S_2$) from that given by Eq. 26
to $J = \beta(1 - \beta/p)$ corresponding to the high ribosome
density phase (II). In this phase the detachment step is
rate-limiting, and further increases in $k$ will no longer
affect the throughput.

FIG. 8: Large $N$ phase diagrams for $R/a = 10$. (A) Phase
diagram for fixed $4\pi aDC_\infty/p = 0.6$ with trajectories $S_{1,2}$
corresponding to increasing kinetic "on" rate $k$. (B) Phase
diagram when $\bar{D} = 4\pi aD/k = 1, 0.25$ is fixed, and trajectories
$S_{3,4}$ correspond to increasing bulk ribosome concentration
$C_\infty$. Trajectory $S_3$ traverses the (I)-(III) phase boundary for
$\bar{D} = 0.25$ (thick curves) but not for $\bar{D} = 1.0$ (thin curves).
Trajectory $S_4$, on the other hand, traverses the (I)-(II) phase
boundaries for both $\bar{D} = 0.25, 1.0$.

If $k$ is held fixed and the ribosome concentration is
independently varied instead, it is more instructive to plot
the phase diagram for fixed $\bar{D} = 4\pi aD/k$ and $R/a$, as
shown in Fig. 8B. Here, we choose the representative
values $R/a = 10$ and $\bar{D} = 4\pi aD/k = 0.25, 1$ and motivate
parameter trajectories obtained by varying only $C_\infty$.
For $\beta/p > 1/2$, increasing the bulk ribosome concentration
traces out the trajectory $S_3$ continuously from the
low density phase (I) (Eq. 26) to the maximal current
($J = p/4$) phase. Further increasing the concentration
well into the maximal current phase will no longer affect
the throughput. Similarly, if $\beta/p < 1/2$, increasing
$C_\infty$ can shift the behavior from that of the low density
phase to that of the high density, exit rate-limited phase.
Alternatively, one may vary $p$, the mean elongation rate
of individual ribosomes, by controlling the tRNA concentration
in solution. For example, decreasing available
tRNA will move the system from the lower left to upper
right in Fig. 8B, eventually reaching a steady-state
current $J = p/4$.

Despite the apparent fundamental importance of the
kinetic binding, or "on" rate in translation, there are no
systematic and independent measurements of $k$ in the
literature. The required independent estimates of $k$ may be
achieved perhaps by combined kinetic and affinity measurements
of the association of a minimal set of components,
including only the ribosomes and a portion of the
5' initiation codons and cofactors. For the off rate $\beta$,
similar ideas can be employed. The tRNA or ribosome
release factor concentrations for the last codon can also
be adjusted to tune the off rate $\beta$.

Codon and UTR Length Dependences 

In experiments where it is possible to vary the number of
codons $N$, the typical harmonic distance $R$ can also be
tuned. The phase diagrams in Figs. 3, 7, and 8 all correspond
to different regimes of Eq. 1 in the large $N$ limit.
In practice, Eq. 1 is no longer sensitive to $N$ for $N > 10$;
however, the harmonic distance $R$ between initiation and
termination sites continues to increase as $\sqrt{N}$, affecting
the local concentration $C(a)$, and thus the effective parameter
$\alpha = kC(a)$ in Eq. 1. As shown in Fig. 7B,
increasing $R/a$ shifts the phase boundaries to the right,
making the maximal current phase (III) harder to attain
unless $k$ or $C_\infty$ is concomitantly increased. However,
due to the $\sqrt{N}$ dependence, this effect would be relatively
weak for all but enormous values of $N$. Hence we
have chosen the qualitatively reasonable value $R/a = 10$
in Figs. 8A, B.

Although there may be a weak increase in $R/a$ as one
increases the mRNA length, the effects of increasing the
coded sections ($N$) or the noncoded sections (the untranslated
regions $m, n$) can be different depending on
$U_0$. For large $U_0$, looped configurations dominate and
the distance between initiation and termination sites will
be more sensitive to $m + n$, the shortest distance between
them (cf. Fig. 4B). The effect of lengthening $m + n$ on
$R/a$ in the high $U_0$ regime is clearly shown in Fig. 5A.
For small $U_0$, open configurations dominate and the short
segments $m$ and $n$ at the two ends do very little to affect
$R/a$ relative to $N$. Thus, although length dependences
are expected to be weak, increasing the codon length
$N$ would more likely increase $R/a$ (and hence decrease
throughput $J$) in the small $U_0$, or repulsive, limit. Conversely,
increasing $m, n$ would more likely increase $R/a$
when $U_0$ is large and loops dominate the mRNA conformations.

Initiation-Termination Cooperative Effects

We have so far considered only the effects of the binding
energy $U_0$ on loop formation, $1/R$, and the resulting
local ribosome concentration at the initiation
site. However, evidence suggests that contact between
elongation factor proteins and/or poly(A) tail proteins
can enhance or suppress the kinetic binding rates
$k$ through direct molecular contact and cooperativity
[Jackson 1996, Munroe & Jacobson 1990, Sachs 1997,
Sachs 2000]. There is the possibility that in looped
states, PABs can interact with initiation machinery and
modify $k$, and/or elongation factors can assist or hinder
detachment of ribosomes at termination. Modification of
$k$ and/or $\beta$ through direct contact between proteins associated
near the initiation and termination sites may be
an additional mechanism by which translation rates can
span the regimes shown in Figs. 7B, C and 8. Qualitatively,
the experimental finding that contact between the
mRNA ends affects the initiation or possibly termination
processes can be modeled by assuming effective "on" or
"off" rates

$$k_{\rm eff}[U_0] = k_0(1 - P_{\rm loop}) + k_1P_{\rm loop}, \qquad \beta_{\rm eff}[U_0] = \beta_0(1 - P_{\rm loop}) + \beta_1P_{\rm loop}, \qquad (30)$$

where $k_0, \beta_0$ and $k_1, \beta_1$ are the binding and "off" rates
when the mRNA is open and looped, respectively. As
$U_0$ is varied, both the intrinsic rates as well as the sink-source
separation $R$ are modified. Using expression 30 for
$k$ and $\beta$ in equations 22 and 24, the dependence of $J$ on
the binding energy $U_0$ can be mapped. A number of qualitatively
different scenarios are possible. If $\beta_0 = \beta_1$ but
$k_1 > k_0$, the current is a monotonically increasing function
of $U_0$ because the binding rate increases and the ribosome
source (3' terminus) is brought closer. Both of these
effects monotonically increase the steady-state current.
However, if for fixed $\beta$, $k_1 < k_0$, then these two effects
can partially balance each other and there is the possibility
of a maximum in $J(U_0)$. A maximum occurs when
initially, as $U_0$ is increased, the decrement in $k_{\rm eff}$ cannot
keep up with the enhancement in local ribosome concentration
due to the increasing likelihood of loop formation
(i.e. the shifting of the high current phase boundary to
lower $k_{\rm eff}$). However, if $k_1$ is sufficiently small, $k_{\rm eff}$ eventually
diminishes, such that one arrives at the low density,
low current regime. These effects are illustrated in
the sequence of figures 9A-C. The steady-state current,
self-consistently calculated from Eqs. 1, 19, and 30, has a
possible maximum and is shown as a function of $U_0$ in figure
9D. Here, we have chosen $k_0C_\infty/p = 50$, $k_1C_\infty/p = 0.3$,
$\beta = 0.75$, $N = 100$, $m = n = 30$, $\varepsilon = 0.2$, $a = 1$, and
$\delta = 0.1$. Only certain sets of parameters permit a maximum.
Small values of $4\pi aDC_\infty/p$ and large $N$ result
in the largest maxima. For large values of $4\pi aDC_\infty/p$,
diffusion is fast, local ribosome concentrations are not
significantly depleted by the initiation site, and the high
current regime is already pushed to low values of $kC_\infty/p$.
Therefore, increasing $U_0$ and decreasing $R$ does not further
drive the high current regime towards significantly
lower $kC_\infty/p$. For essentially the same reason, smaller
$N$ enhances ribosome recycling, increasing the current at
low $U_0$ and thereby shifting the maximum in $J$ to lower
values of $U_0$. As illustrated in the example given in figure
9D, increases of $\sim 50-60\%$ above the background
current are possible as $U_0$ is varied. Thus, we see that the
two processes, direct molecular catalysis of initiation and
termination, and ribosome diffusional depletion, balance
each other and may provide delicate control mechanisms
during later stages of gene regulation.

FIG. 9: The current (Eq. 1) as a function of $U_0$ when the
ribosome "on" rate $k$ can be modified by direct interactions
with elongation factor and PAB proteins. The Gaussian chain
approximation is used with persistence length $\ell = a$. (A-C)
show hypothetical, qualitative trajectories in the presence of
a changing phase diagram. As $U_0$ is increased, $R$ decreases.
With $4\pi aDC_\infty/p = 0.6$ fixed, the phase boundaries shown in
A-C correspond to $R/a = 25, 3, 3/2$, respectively. In addition,
if $k_0 > k_1$, the effective binding rate $k_{\rm eff}C_\infty/p$ also
decreases with increasing $U_0$, resulting in the trajectories
indicated by the dot. (D) Currents for $k_0C_\infty/p = 50$,
$k_1C_\infty/p = 0.3$, and $N = 100$. The weak maximum appears
only for small $4\pi aDC_\infty/p$.
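
The interpolation of Eq. 30 is simple to evaluate once the looping probability is known. The sketch below is a minimal illustration, assuming $P_{\rm loop} = P^{(0)}_{\rm loop}/(P^{(0)}_{\rm loop} + (1 - P^{(0)}_{\rm loop})e^{-U_0})$ and an assumed bare looping probability of $4\times10^{-4}$; function names and sample rates are hypothetical.

```python
import math


def loop_probability(U0, p0):
    """Looping probability for binding energy U0 (in k_B T) and bare value p0."""
    return p0 / (p0 + (1.0 - p0) * math.exp(-U0))


def effective_rates(U0, p0, k0, k1, beta0, beta1):
    """Eq. 30: open-state rates (k0, beta0) and looped-state rates (k1, beta1)
    weighted by the probabilities of the open and looped conformations."""
    P = loop_probability(U0, p0)
    k_eff = k0 * (1.0 - P) + k1 * P
    beta_eff = beta0 * (1.0 - P) + beta1 * P
    return k_eff, beta_eff


# Assumed example: binding is strongly suppressed in the looped state (k1 << k0),
# while the off rate is unchanged; rates are in units of p / C_inf and p.
p0 = 4e-4
for U0 in (0.0, 6.0, 8.0, 10.0, 14.0):
    k_eff, beta_eff = effective_rates(U0, p0, k0=50.0, k1=0.3, beta0=0.75, beta1=0.75)
    print(f"U0 = {U0:4.1f}   k_eff = {k_eff:7.3f}   beta_eff = {beta_eff:.2f}")
```

The effective binding rate switches from $k_0$ to $k_1$ over a few $k_BT$ around the looping crossover, which is the same energy window in which $R$ decreases; the competition between these two trends is what permits a maximum in $J(U_0)$.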



Kinetic vs. Thermodynamic Control 

Finally, we point out that our analysis has been confined
to the steady-state (for the bulk ribosome diffusion
and individual ribosome movement along the mRNA)
and thermodynamic equilibrium (for the polymer
statistics). Since it is possible for diffusion and
ribosome elongation along the mRNA to reach steady-state
before the mRNA chain reaches conformational
equilibrium (in the presence of loop-forming proteins), a
possibility exists for "kinetic versus thermodynamic control"
of the measured ribosome throughput. Although
the loop-binding energy $U_0$ determines the equilibrium
distribution of open and closed mRNA conformations via
$P_{\rm loop}$, the kinetics of loop opening and closing are determined
by energy activation barriers of the loop binding
proteins. For example, if the activation energy for creating
a looped state is high, the mRNA may sample only
unlooped conformations on time scales of the "steady-state"
(with respect to the TASEP and diffusion). In
this scenario, the effect of the loop binding protein does
not arise and the harmonic distance ($R$) would appear
to be that associated with an open chain ($U_0 \to -\infty$ in
Fig. 5A,B). Conversely, if the mRNA chain happens
to be in a looped conformation and the free energy
barrier for dissociation of the loop is large, the measured
current may be that corresponding to only a closed
mRNA loop (mimicking the case $U_0 \to \infty$). This is likely
to occur if the measurement time $\tau \ll \tau_{\rm diss}$, with
$\tau_{\rm diss}^{-1} \propto e^{-U^*}$,
where $\tau_{\rm diss}$ is the spontaneous dissociation time (or the
Kramers escape time) and $U^*$ is the activation barrier
energy/($k_BT$). The activation energy $U^*$ depends on
the specific molecular details of the loop-forming proteins;
however, measurements using fluorescence quenching
can be used to independently determine the distribution
of times the mRNA chain is looped or unlooped
[Goddard et al. 2000]. Only when $U_0$ or $U^*$ are large
does ribosome recycling get significantly enhanced by
loop formation. Transient measurements, as well as fluctuations
of the measured throughput, are beyond the scope
of the paper.

SUMMARY 

We have constructed a simple model and road map for
the possible physical effects at play during translation.
The model incorporates driven diffusive motion which
obeys exclusion statistics for ribosomes along mRNA.
The initiation and termination sites are considered as
sinks and sources of ribosome concentration, described
by the steady-state diffusion equation (Laplace's equation).
The average conformations of the mRNA chain
define the typical initiation-termination distance, which
determines how terminated ribosomes diffuse directly
back to the initiation site and affect the local concentration
there. This local concentration is a parameter
(it sets the injection rate) in the exclusion process, but also
depends on the overall ribosome throughput (the strength
of the sink and source). Thus, the current $J$ needs to
be solved self-consistently. Direct cooperative enhancement
of kinetic binding and "off" rates was also incorporated.
Although it is thought that the rate-limiting
step is binding and initiation of ribosomes at the initiation
site [Clemens 1996, Mathews et al. 1996], the fact
that polysomes have been found to exist in both high
and low ribosome occupancy states suggests that under
physiological conditions, steady-state ribosome fluxes can
span the regimes defined by the phase diagrams depicted
in Figs. 3 and 7B, C. At high occupancy, the rate-limiting
step is the off rate $\beta$, which controls the steady-state
flux (cf. Phase (II) in Fig. 3). Ribosome depletion by
the sink and replenishment by the source can drastically
affect the constant $k$, $\beta$ phase diagram, as shown in
Fig. 7. The critical value of $k^{*}C_\infty/p$ that defines
the left boundary of the maximal current phase (in the
$N \to \infty$ limit) is most sensitive to the dimensionless parameter
$4\pi a D C_\infty/p$ when $4\pi a D C_\infty/p \approx 0.15$ to $0.3$. For
sufficiently small $4\pi a D C_\infty/p$, the effective injection rate
cannot reach 1/2 and the maximal current phase cannot
be attained. When $N < \infty$, the explicit currents were
computed from Eq. 1 and plotted in Fig. 6. Given the
possibility of cooperative interactions in looped mRNA
configurations, we have also found a maximum in ribosome
throughput as a function of loop-binding energy
$U_0$.
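The self-consistency described above can be illustrated with a minimal numerical sketch. It is not the calculation used in the paper: the form of the local concentration below (bulk value, depleted by a sink at contact distance $a$ and replenished by a point source a distance $R$ away) is an assumption made for illustration, the function names are hypothetical, and the TASEP current is the standard $N \to \infty$ result.

```python
import math

def tasep_current(alpha, beta, p=1.0):
    """Standard N -> infinity TASEP current for entry rate alpha, exit rate beta, hop rate p."""
    a, b = alpha / p, beta / p
    if a >= 0.5 and b >= 0.5:
        return p / 4.0                  # maximal current phase
    if a < b:
        return p * a * (1.0 - a)        # low density phase (entry limited)
    return p * b * (1.0 - b)            # high density phase (exit limited)

def local_concentration(J, c_inf, D, a, R):
    """ASSUMED steady-state superposition: bulk value, minus the sink contribution
    at r = a, plus the contribution of a source of equal strength at distance R."""
    c = c_inf - J / (4.0 * math.pi * D * a) + J / (4.0 * math.pi * D * R)
    return max(c, 0.0)

def self_consistent_current(k, beta, c_inf, D, a, R, p=1.0, iters=200):
    """Bisection for the J that solves J = tasep_current(k * C(a; J), beta)."""
    lo, hi = 0.0, p / 4.0               # the TASEP current never exceeds p/4
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        injected = k * local_concentration(mid, c_inf, D, a, R)
        if tasep_current(injected, beta, p) > mid:
            lo = mid                    # the TASEP can carry more than mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    for R in (1.5, 10.0, 100.0):
        J = self_consistent_current(k=1.0, beta=1.0, c_inf=0.3, D=0.05, a=1.0, R=R)
        print(f"R = {R:6.1f}  J = {J:.4f}")
```

Bisection is used because, for $R \geq a$, the residual is monotonic in $J$ under the assumed concentration profile; bringing the source closer (smaller $R$) raises the local concentration and hence the current.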

Many molecular and chemical details have been ne- 
glected. As mentioned, we have ignored the fact that 
numerous components must assemble before initiation 
and have modeled only an "effective" rate-limiting component.
The surface concentration parameter $C(a)$ in
our model would be an effective concentration reflecting
the local density of ribosomes capable of initiation.
Proposed mechanisms of ribosome scanning [Jackson 1996],
whereby ribosomes attach to segments of mRNA and 
undergo one-dimensional diffusion before encountering 
the initiation site, can be adequately modeled with the 
present approach if one assumes that the rate-limiting 
step is initial adsorption onto an mRNA segment. Fur- 
thermore, we have assumed that the ribosomes do not 
detach from the mRNA until they reach the termination 
site and that their forward hopping rates are uniform 
across the whole coding region. Finally, in our simple
polymer model, we have neglected both self-avoidance
(chain-chain as well as chain-ribosome exclusion) and
the fact that the effective persistence length may vary
along the mRNA, depending on the local ribosome
density.

Despite these simplifying assumptions, we find that 
qualitatively, subtle control mechanisms can come into 
play, depending on biologically reasonable physical pa- 
rameters. Although there are numerous experiments 
probing translation, both in vivo and in vitro, many dif- 
ferent systems and physical conditions are employed, ren- 
dering quantitative comparison with measurements diffi- 
cult. Nonetheless, our model suggests new measurements 
that can be used to qualitatively probe the various physi- 
cal hypotheses and exhibit our predicted physical trends. 
For example, the effective $C_\infty$ can be varied in a number
of ways to test the predicted current regimes.
Occupancy along the mRNA can also be correlated with the
high, low, and intermediate density phases. Addition- 
ally, the noncoded regions between the elongation factors 
and the initiation site, and the termination site and the 
poly(A) tail-bound PAB can be varied to test possible
cooperative interactions defined by Eq. 30. Since the loop
formation probability $P_{\rm loop}$ depends on the total statistical
length $L_T$, which is dominated by the length of the
coding region ($N a^{2} \gg (m + n)\ell^{2}$), varying $m$ and $n$
would affect, through the likelihood of molecular contact
in the looped states, only $k_{\rm eff}$ and $\beta_{\rm eff}$, respectively. The
actual probability of loop formation $P_{\rm loop}$, and hence $R$,
would not be significantly affected. Chemical modification
of the elongation factors or the PABs would affect
$U_0$, and hence $k_{\rm eff}$, $\beta_{\rm eff}$, and $R$ through $P_{\rm loop}$. Using
micromanipulation techniques [Bustamante et al. 2000], it
might also be possible to fix the initiation-termination
distance in vitro.

Numerous extensions to the presented models can be 
straightforwardly incorporated to more precisely model 
the chemical and microphysical processes. Codon and 
tRNA concentration-dependent variations in the inter- 
nal transition rates p [Kruger et al. 1998], as well as 
random detachment processes, can be implemented us- 
ing simple lattice simulations. Sites along the mRNA 
chain at which ribosomes pause can be treated as "de- 
fects" in a TASEP and the whole process can be treated 
with mean-field theory [Kolomeisky 1988]. Multiple cod- 
ing regions in prokaryotic translation (Shine-Dalgarno 
sequences) can be modeled as a sequence of initiation 
(sinks) and termination (sources) sites. Similarly, cap- 
independent initiation at internal ribosome entry sites 
[Jackson 1996, Martínez-Salas et al. 2001, Sachs 1997]
(IRES) can also be treated as sinks within our basic
model. Translation of ER-associated mRNA further involves
ribosomes that attach to the mRNA at certain points
on the ER membrane. In this case, one expects the density
of cytoplasmic and ER-bound ribosomes to have a
strong effect on the localization of mRNA to the ER and on
overall translation rates. One can also consider cases where
the protein product itself is a ribosome product necessary
for its self-translation; this process would result
in initially autocatalytic protein production. Although
these more complicated and interesting extensions have 
not been considered here, the simple models we have pre- 
sented represent a first step towards the rich problem of 
identifying and quantifying the physical and biological 
mechanisms that control late stages of expression. 
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9804370 and DMS-0206733. 



APPENDIX A: PHYSICAL ASSUMPTIONS AND 
MATHEMATICAL APPROXIMATIONS

Although our model arrives at a number of conclusions 
that are developed by combining three different physi- 
cal theories, the assumptions and approximations used in 
each are well developed in the condensed matter physics 
and biophysics literature. Here, we summarize the main 
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physical assumptions and review the mathematical ap- 
proximations used: 

• Steady-state and equilibrium assumptions: Ribosome 
diffusion and motion along the mRNA are treated within 
steady state, while the configurational distribution of the 
mRNA polymer is not directly coupled to ribosome dif- 
fusion or motion, and is considered in thermodynamic 
equilibrium. The inverse "harmonic distance" $1/R$ is
determined from equilibrium mRNA configurational distributions,
but parametrically influences the nonequilibrium
steady-state processes of diffusion and the TASEP.
Equilibration times of unentangled polymers and diffusion
times over the length of the mRNA are on the order
of milliseconds to seconds, while relaxation to the steady
state in the TASEP occurs over seconds to
a couple of minutes. Thus, on experimental time scales
longer than these, transients in the ribosome throughput 
have dissipated, and the steady-state and equilibrium as- 
sumptions are appropriate. 

One might be tempted to formulate the specific mech- 
anisms in terms of the common notions of reactions be- 
ing kinetically or thermodynamically controlled. In this 
biochemical terminology, the TASEP is kinetically con- 
trolled, since the ribosomes take irreversible steps as each 
amino acid is added during elongation. The mRNA con- 
figurations, computed under equilibrium conditions, are 
by definition thermodynamically controlled. However,
since each of the proposed mechanisms is a simple, single,
independent process, the notion of kinetic control versus
thermodynamic control is irrelevant. Within each mechanism,
there are no alternate "reaction paths" or outcomes
for kinetic or thermodynamic control to apply.
Nevertheless, it is possible that the mRNA conformations
and the binding protein-mediated loop formation do
not reach equilibrium on the time scale of measurements
of ribosome throughput. This possibility is also discussed 
in the Experimental Consequences and Proposed Mea- 
surements section. 

• Gaussian chain polymer model for mRNA: Unlike
tRNA, the coding regions of mRNA are relatively devoid
of secondary structure. The single-stranded mRNA is
treated using the standard statistical physics of polymers,
which assumes noninteracting random walks of step size
defined by the polymer persistence length. For single-stranded
mRNA without adsorbed proteins, the persistence
length is $\approx 2$ to $3$ bases. When the mRNA is loaded with large
ribosomes, we assume that the persistence length is on the
order of the ribosome size and that it is approximately
uniform along the chain. Although the ribosome loading
might vary slightly along the chain, this variation
occurs only near the ends and does not appreciably affect
the equilibrium end-to-end distributions. Although
we treat only phantom (noninteracting) polymers, effects
due to the binding of finite-sized PABs and cap
proteins are explicitly treated when computing the end-to-end
distribution functions in the small distance regime,
where steric exclusion of the end proteins is important.



• Single component "ribosomes": The assembly of ribo- 
somes before or during adsorption onto the initiation site 
can be modeled as an effectively single, rate-limiting component
that undergoes standard diffusion in the bulk solution.
Including more chemical details will not qualitatively
alter our results, since in the diffusive steady state, all
species' concentrations would be spatially distributed as 
1/R and parametrically affect the TASEP in the same 
qualitative manner. 

• Equal particle and step sizes: Ribosomes moving along 
mRNA are treated with a discrete TASEP where the step 
size is exactly equal to the particle diameter. However, 
ribosomes are large and occlude $\approx 10$ codons, so that
they move one particle diameter only after about $q = 10$
steps (amino acid transfers). Nonetheless, the qualitative
behavior of the currents for different $q$ remains unchanged.
For the sake of simplicity and clear analytic expressions 
(Eqs. 22, 24, and 26), we have restricted our analysis 
to q = 1. Exact large N asymptotic expressions for the 
steady state current for general q are given in Appendix 
D. 

• Uniform elongation step rates in the TASEP: The 
analytic solutions represented by Eqs. 1, 2, and 3 
are based on uniform elongation rates p along the 
mRNA. It is known that p can vary by factors of 2-10 
[Kruger et al. 1998], depending on the codon in question
and the availability of the associated tRNA. As a first
step, we have simply assumed a scenario in which the
elongation rates do not vary appreciably over the coding
region. More elaborate models that include specified
elongation rates $p_i$ across the mRNA chain would require
extensive simulations for each realization of $\{p_i\}$.
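As an illustration of the kind of lattice simulation such models would require, the following is a minimal sketch of a TASEP with bond-dependent hop rates, using a random-sequential update. It is not the authors' code: the function name, the update scheme, and the example rates are assumptions, particles occupy a single site ($q = 1$), and all rates are taken to be at most 1 so that they can be used directly as acceptance probabilities.

```python
import random

def simulate_tasep(hop_rates, alpha, beta, sweeps=10000, burn_in=1000, seed=0):
    """Estimate the steady-state current of an open TASEP.

    hop_rates[i] is the rate for a particle to hop from site i to site i + 1,
    so the lattice has len(hop_rates) + 1 sites. alpha and beta are the entry
    and exit rates. All rates must be <= 1 (they are used as probabilities).
    One sweep of (number of sites + 1) bond updates is one unit of time.
    """
    rng = random.Random(seed)
    n = len(hop_rates) + 1
    occ = [0] * n
    exits = 0
    for sweep in range(burn_in + sweeps):
        for _ in range(n + 1):
            bond = rng.randrange(n + 1)      # 0 = entry, n = exit, else internal
            u = rng.random()
            if bond == 0:
                if occ[0] == 0 and u < alpha:
                    occ[0] = 1
            elif bond == n:
                if occ[n - 1] == 1 and u < beta:
                    occ[n - 1] = 0
                    if sweep >= burn_in:     # count exits only after burn-in
                        exits += 1
            elif occ[bond - 1] == 1 and occ[bond] == 0 and u < hop_rates[bond - 1]:
                occ[bond - 1], occ[bond] = 0, 1
    return exits / sweeps

if __name__ == "__main__":
    uniform = [1.0] * 49
    with_slow_codon = list(uniform)
    with_slow_codon[24] = 0.2                # one hypothetical slow bond mid-chain
    print("uniform rates :", simulate_tasep(uniform, 1.0, 1.0))
    print("one slow bond :", simulate_tasep(with_slow_codon, 1.0, 1.0))
```

A single slow bond caps the current at its own hop rate, which is why one realization of $\{p_i\}$ can differ markedly from the uniform case; averaging over realizations would require repeating the run for each rate set.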

• Bulk diffusion limited adsorption: Ribosomes, or the
relevant rate-limiting components of a ribosome, diffuse
in bulk and attach directly to the initiation
site. Capture of the ribosome by the initiation end of
the mRNA may occur in a two-step process of nonspecific
adsorption from bulk, followed by linear diffusion
along a segment of the mRNA, before ultimately
interacting specifically with the initiation site
[von Hippel & Berg, Stanford et al. 2000]. Although
such mechanisms have been studied in the context of linear
diffusion and search along DNA [Berg & Purcell 1977],
direct evidence for scanning mechanisms in the initiation of mRNA
translation has been hard to obtain [Jackson 1996]. For
example, secondary structure in the form of small mRNA
knots near the 5' region must be melted before efficient
ribosome scanning can occur [Kozak 1989]. Nevertheless,
one-dimensional diffusion of ribosomes along the
mRNA near the initiation site is implicitly included in
our model. The conjectured scanning mechanisms suggest
that ribosomes scan locally near the start codon
[Jackson 1996, Wang et al. 1997]. Thus, if ribosome
recycling via diffusion through the bulk is rate-limiting, the
scanning region near the initiation site where the linear
diffusion occurs can be considered as a binding region of larger
effective capture radius $a$.
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APPENDIX B: MEAN FIELD ANALYSIS FOR 
LARGE PARTICLES 



Consider identical particles that are driven through a 
long one-dimensional lattice of L sites. The lattice is dis- 
cretized into steps of unit length (a step size correspond- 
ing to a codon step), while the particles are of integer 
size $q \geq 1$. For each particle to move a distance roughly
equal to its diameter, $q$ consecutive steps must be taken.
Thus, we expect that effectively, the mean current would
be approximately described by Eqs. 1 or 3 but with
$p$ replaced by $p/q$. A mean field model for the asymmetric
exclusion process containing particles that occupy $q$
substrate lattice sites (mRNA codons) has been solved. 
The analysis is beyond the scope of this paper, but the re- 
sulting steady-state currents follow the same qualitative 
"phase diagram" (Fig. 3) as the TASEP with particles 
of size $q = 1$. That is, for large entrance and exit rates,
there is a maximal current phase (III), bounded by low
(I) and high (II) density phases. Increasing
the particle size to $q > 1$ changes the
values of the currents in each of these phases only quantitatively, and can be
straightforwardly integrated into the present study.

The general (for all particle sizes $q$) results for the
steady-state currents in the infinite chain length limit
are




FIG. 10: Schematic of the geometry near the initiation-termination
end of a looped mRNA. The mRNA loop-binding
factors are shown in yellow and black, while a ribosome of radius
$a$ is situated at the initiation site (not drawn to scale). $m$
and $n$ correspond to the number of bases of the UTRs, which
are assumed to be relatively protein-free and have short persistence
length $\ell$. Here, the persistence length in the coding
regions (thick curve, described by the TASEP) is $\ell \sim a$.



APPENDIX C: OPEN CHAIN PROBABILITY 
DISTRIBUTIONS 
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These results have been verified to be exact (to within 
numerical precision) by extensive Monte Carlo simulations.
Note that for large $q$, the maximal current $J_{\max}$
is that given by Eq. 3 but with $p \to p/q$. These results
only serve to quantitatively shift the phase boundaries
between the different current regimes and decrease the
magnitude of the currents. For example, if $q = 2, 3$, the
phase boundary between the low density and the maximal
current regime occurs at $\alpha/p = 0.41, 0.37$, respectively,
rather than at 0.5. For the sake of simplicity and
manageable algebraic expressions, we have in this study 
only considered the q = 1 case. Our analysis should be 
applied to the mRNA translation problem with the un- 
derstanding that p in Eq. 3 and subsequent equations 
is roughly the rate for a ribosome to move its molecular 
size, not the rate for an individual tRNA transfer. If, 
however, the above expressions were used, then $p$ in
Eq. B1 would be identified with the typical single
amino acid transfer rate.
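The quoted shift of the phase boundary can be checked against a simple closed form. The sketch below assumes the mean-field expression commonly reported for exclusion processes with extended particles, $\alpha^{*}/p = 1/(1 + \sqrt{q})$; this formula is an assumption introduced here for illustration and is not taken from the text above, although it reproduces the quoted values 0.41 and 0.37 for $q = 2, 3$ and the usual 0.5 for $q = 1$.

```python
import math

def boundary_alpha_over_p(q):
    """ASSUMED mean-field boundary between the low density and maximal current
    phases for particles occupying q lattice sites: alpha*/p = 1 / (1 + sqrt(q))."""
    return 1.0 / (1.0 + math.sqrt(q))

if __name__ == "__main__":
    for q in (1, 2, 3, 10):
        print(q, round(boundary_alpha_over_p(q), 2))
```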



Consider the probability distribution $W(r|\mathrm{open})$ of
the initiation-termination separation in the absence of
loop formation. The ribosome can be much larger
than the typical persistence length in the noncoding region
of single-stranded mRNA, $a \gg \ell$. For $a \approx 10\ell$,
$a$ is not small compared with $L_{mn}$ unless the noncoding regions are very long,
with $m + n \gg 100$. For shorter noncoding regions, the
expression for $W(r; L_{mn}|\mathrm{open})$ must be evaluated more
carefully, particularly for small $r$, in order to compute
$\int dr\, W(r)/r$ correctly. Assume the termination site starts
a random walk from any position on the sphere. Details
of the different segments of mRNA are shown in Fig. 10.
The problem maps to that of heat diffusion from a sphere
of radius $a$ with reflecting boundary conditions and an
instantaneous uniform temperature source on the surface.
The probability that the initiation site (which is linked
to the termination site via $m + n$ persistence lengths) is
within $r$ of the sphere can also be described by the temperature
near a sphere with an exterior instantaneous
source of temperature. The probability distribution
$W(r; L_{mn}|\mathrm{open}) \equiv W$ obeys the diffusion equation



$$\partial_t W(r,t) = \kappa\,\Delta W(r,t), \qquad \text{(C1)}$$

where the thermal conductivity is associated with the
squared persistence length, $\kappa \propto \ell^{2}$, and time corresponds
to the length, $t \leftrightarrow m + n$. The initial and boundary
conditions corresponding to a chain that originates
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from the surface of the otherwise impenetrable ribosome 
particle are 



The initiation-termination distance distributions can be 
estimated using 



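$$\partial_r W(r = a, t) = 0, \qquad W(r, t = 0) = \frac{\delta(r - a)}{4\pi a^{2}}, \qquad \text{(C2)}$$

where we have assumed spherical symmetry. Following
Carslaw and Jaeger (1959), we define $W(r,t) = f(r,t)/r$
to reduce (C1) to $\partial_t f(r,t) = \kappa\,\partial_r^{2} f(r,t)$, with boundary
conditions

$$\partial_r f(r = a) = \frac{1}{a} f(a), \qquad f(r, t = 0) = \frac{\delta(r - a)}{4\pi a}. \qquad \text{(C3)}$$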

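The solution for $f(r,t)$ is found using Laplace transforms,
and is

$$f(r,t) = \frac{1}{4\pi a}\left[\frac{e^{-(r-a)^{2}/(4\kappa t)}}{\sqrt{\pi\kappa t}} - \frac{1}{a}\, e^{(r-a)/a + \kappa t/a^{2}}\, \mathrm{Erfc}\!\left(\frac{r-a}{2\sqrt{\kappa t}} + \frac{\sqrt{\kappa t}}{a}\right)\right].$$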
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(C8) 



The probability distribution is thus 



The end-to-end probability distributions from both the FJC
and WLC models are plotted in Figs. 11A, B. The WLC
model gives qualitatively similar distributions to those of
the FJC model, provided the contour length is appropriately
reduced. Furthermore, the WLC and FJC models
provide qualitatively similar averages $\langle a/r \rangle$ if the $N$ used
in the WLC is sufficiently reduced. Upon using Eqs. 15
and 16, one can compute the effective end-to-end distribution
of a chain with segments of different persistence
length and with attached loop-binding proteins, as shown
in Fig. 11C.



APPENDIX D: ASYMPTOTICS FOR $J_N$



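$$W(r, L|\mathrm{open}) = \frac{1}{4\pi a r}\left[\sqrt{\frac{6}{\pi N\ell^{2}}}\; e^{-3(r-a)^{2}/(2N\ell^{2})} - \frac{1}{a}\, e^{r/a - 1}\, e^{N\ell^{2}/(6a^{2})}\, \mathrm{Erfc}\!\left(\sqrt{\frac{3}{2}}\,\frac{r-a}{\sqrt{N}\,\ell} + \frac{\sqrt{N}\,\ell}{\sqrt{6}\,a}\right)\right], \qquad L^{2} = N\ell^{2}. \qquad \text{(C5)}$$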



Note that if $a/L \ll 1$, as it is for $L = L_T$, Eq. (C5)
would be approximately

$$W(r, L|\mathrm{open}) \approx \left(\frac{3}{2\pi L^{2}}\right)^{3/2} e^{-3r^{2}/(2L^{2})}\left[1 + O\!\left(\frac{a}{L}\right)\right], \qquad \text{(C6)}$$

which reduces to the end-to-end probability distribution for
a Gaussian random chain. However, since $a/L_{mn}$ is not small,
we need to use the full expression Eq. C5 for the loop
contribution (Eq. 15) in the calculation of $W_{\rm eff}(r)$ and
$1/R$.
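The open-chain distribution and the average entering $1/R$ can be evaluated numerically. The sketch below uses one reading of Eq. C5, written out in the docstring; that explicit form, the function names, and the parameter values are assumptions made for illustration. The product of the growing exponential and Erfc is computed through a scaled complementary error function to avoid overflow when $L \gg a$.

```python
import math

def erfcx(z):
    """Scaled complementary error function exp(z^2) * erfc(z) for z >= 0."""
    if z < 20.0:
        return math.exp(z * z) * math.erfc(z)
    # leading terms of the large-z asymptotic series
    return (1.0 - 0.5 / z**2 + 0.75 / z**4) / (z * math.sqrt(math.pi))

def w_open(r, L, a):
    """ASSUMED form of Eq. C5, with s^2 = L^2 / 6 and x = r - a:
    W = [exp(-x^2/4s^2)/(s sqrt(pi)) - (1/a) exp(x/a + s^2/a^2) erfc(x/2s + s/a)] / (4 pi a r).
    """
    if r < a:
        return 0.0                          # the chain cannot enter the sphere
    x = r - a
    s = L / math.sqrt(6.0)
    z = x / (2.0 * s) + s / a
    gauss = math.exp(-x * x / (4.0 * s * s))
    # exp(x/a + s^2/a^2) * erfc(z) == gauss * erfcx(z)
    bracket = gauss * (1.0 / (s * math.sqrt(math.pi)) - erfcx(z) / a)
    return bracket / (4.0 * math.pi * a * r)

def radial_average(g, L, a, n=20000):
    """Midpoint-rule estimate of the integral of g(r) 4 pi r^2 W(r) dr over r > a."""
    r_max = a + 12.0 * L                    # the Gaussian tail is negligible beyond this
    dr = (r_max - a) / n
    total = 0.0
    for i in range(n):
        r = a + (i + 0.5) * dr
        total += g(r) * 4.0 * math.pi * r * r * w_open(r, L, a) * dr
    return total

if __name__ == "__main__":
    a, L = 1.0, 3.0
    print("normalization:", radial_average(lambda r: 1.0, L, a))
    print("<a/r>        :", a * radial_average(lambda r: 1.0 / r, L, a))
```

The normalization integral serves as a consistency check on the assumed expression, and for $a/L \ll 1$ the function approaches the Gaussian form of Eq. C6.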

For the WLC, an approximate probability distribution
function can be reconstructed from commonly used
phenomenological force-extension relationships. The
force-extension interpolation given by Marko and Siggia
[Marko & Siggia 1995] can be shifted to take into account the
finite-sized origin,

$$\frac{F\ell}{k_BT} = \frac{1}{4}\left(1 - \frac{z - a}{N\ell}\right)^{-2} - \frac{1}{4} + \frac{z - a}{N\ell}. \qquad \text{(C7)}$$



Asymptotic expressions for the steady-state current 
given by Derrida et al. [Derrida et al. 1993] are valid 
only far from the phase boundaries. However, in our 
present model, we are interested in how a change in the 
mRNA length N allows the system to cross over from 
one behavior to another. For the sake of completeness, 
we derive limiting forms for the current $J_N$ near phase
boundaries. An asymptotic expansion in the rates about
$\alpha = 1/2$ is taken first, with $N$ fixed. From the exact
expression Eq. 2 given by [Derrida et al. 1993], we find
the following asymptotic expansion 



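$$S_N(1/\alpha)\Big|_{\alpha = 1/2} = \frac{4^{N}}{\sqrt{\pi}}\,\frac{2\,\Gamma(N + 1/2)}{\Gamma(N + 1)} = \frac{2\cdot 4^{N}}{\sqrt{\pi N}}\left[1 - \frac{1}{8N} + \frac{1}{128N^{2}} + O(N^{-3})\right]. \qquad \text{(D1)}$$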

For $\beta > 1/2$ and $\alpha = 1/2 + \varepsilon$, we take the large $N$ limit,
but with $\varepsilon\sqrt{N} \to 0$. The resulting current across the
maximal current-low density phase boundary is



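[Eq. (D2): expansion of $J_N$ in inverse powers of $\sqrt{N}$ to first order in $\varepsilon$, with $\beta$-dependent coefficients containing $(2\beta - 1)^{-2}$ and remainders $O(N^{-3/2})$ and $O(\varepsilon^{2})$; the full expression is not legible.]
(D2)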
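The finite-$N$ behavior near a phase boundary can also be examined directly from the exact current. The sketch below assumes that Eq. 2 is the standard matrix-product normalization of [Derrida et al. 1993], written here for unit hop rate ($p = 1$); the function names are hypothetical, and exact rational arithmetic is used so that no rounding enters the comparison.

```python
from fractions import Fraction
from math import factorial

def normalization(N, alpha, beta):
    """ASSUMED form of the exact normalization for an N-site TASEP with p = 1:
    Z_N = sum_{i=1}^{N} i (2N-1-i)! / (N! (N-i)!) * sum_{j=0}^{i} beta^-j alpha^-(i-j).
    The inner sum equals (beta^-(i+1) - alpha^-(i+1)) / (beta^-1 - alpha^-1)
    and stays finite when alpha == beta.
    """
    if N == 0:
        return Fraction(1)
    inv_a, inv_b = 1 / Fraction(alpha), 1 / Fraction(beta)
    total = Fraction(0)
    for i in range(1, N + 1):
        weight = Fraction(i * factorial(2 * N - 1 - i), factorial(N) * factorial(N - i))
        total += weight * sum(inv_b**j * inv_a**(i - j) for j in range(i + 1))
    return total

def exact_current(N, alpha, beta):
    """Exact steady-state current J_N = Z_{N-1} / Z_N (hop rate p = 1)."""
    return normalization(N - 1, alpha, beta) / normalization(N, alpha, beta)

if __name__ == "__main__":
    half = Fraction(1, 2)
    for N in (5, 20, 80):
        J = exact_current(N, half, 1)       # on the boundary alpha = 1/2
        print(N, float(J))
```

On the boundary $\alpha = 1/2$ the current approaches $1/4$ only algebraically in $N$, whereas deep in the low density phase the finite-size corrections are exponentially small; this slow crossover is what the expansions of this appendix quantify.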
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FIG. 11: (A) FJC and WLC models for iy(r|opcn) for £/a = 
0.2. The WLC distribution approximates that of the FJC
if the effective number of persistence lengths is reduced.
This reduction compensates for the stiffness of the chain, which
tends to give more weight at larger distances. (B) FJC and
WLC distributions for $\ell/a = 1$. Note the heuristic cutoff
applied to the WLC model at $r = a$. As expected, for equal
$N$, the WLC model gives a typically larger separation and
hence smaller $a/R$; however, $a/R \propto N^{-1/2}$ for $N \to \infty$ in
all cases. (C) The effective end-to-end distance distribution
$W_{\rm eff}$ constructed from $W(r|\mathrm{open})$ via Eqs. 14 and 15.
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